Sunday, May 27, 2018

Streaming Live Tweets from Twitter to CosmosDB


It is time for another blog on CosmosDB, this time explaining how to stream tweets from Twitter using hashtags and store them in CosmosDB in real time. You should be able to set up and run this demo within 15 minutes.

Pre-Requisites Needed:

I have the following in my local environment. Hope you guys already have them 😊; if not, start setting them up.

· Windows 10 OS
· Python 2.7
· Visual Studio Code or PyCharm (any editor)
· Azure subscription


Ok folks let’s get started.

Step 1: Install Python

Hopefully you have already installed Python on your system; if not, download and install it from here. Once installed, run the following command to check that it is properly installed.

Step 2: Install Tweepy and PyDocumentDB

Install the following libraries needed. 

Tweepy:

Tweepy is an easy-to-use Python package for accessing the Twitter API. Its API class provides access to the Twitter RESTful API methods; each method accepts various parameters and returns a response. Install it with the following command:

pip install tweepy

If you get the error "'pip' is not recognized as an internal or external command", add the Python Scripts folder to your PATH as follows:

C:\>set PATH=%PATH%;C:\Python27\Scripts

Now you should be able to install it without any issue.

Pydocumentdb:

As mentioned above, we will store the tweets in Azure CosmosDB. To do that we need the Python package for CosmosDB, which is pydocumentdb. Install it with the following command:


pip install pydocumentdb

Now we have everything needed. Let's dive into coding.


Step 3:  Creating Listener to invoke the cosmosdb client

Create a listener named CosmosDBListener with the following methods:

__init__ receives the CosmosDB client and the collection link and stores them for later use.

on_data loads the data retrieved from the stream and writes it to CosmosDB.

on_error prints the status to the console if there are any network or key issues.

from config import *
import json
from tweepy.streaming import StreamListener

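class CosmosDBListener(StreamListener):
    """Stream listener that writes every incoming tweet to a CosmosDB collection."""

    def __init__(self, client, collLink):
        # Let the tweepy base class run its own setup first.
        super(CosmosDBListener, self).__init__()
        self.client = client
        self.collLink = collLink

    def on_data(self, data):
        try:
            dictData = json.loads(data)
            # CosmosDB requires the document id to be a string.
            dictData["id"] = str(dictData["id"])
            self.client.CreateDocument(self.collLink, dictData)
        except Exception as e:
            # Log and keep streaming; Exception (not BaseException) lets Ctrl+C through.
            print("Error on data: %s" % str(e))
        return True

    def on_error(self, status):
        print(status)
        # Returning False on HTTP 420 (rate limited) disconnects instead of retrying.
        if status == 420:
            return False
        return True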
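
To see what on_data does to each message without connecting to Twitter or CosmosDB, here is a minimal standalone sketch. The helper name prepare_document and the sample payloads are made up for illustration; the only behaviour taken from the listener above is parsing the JSON and converting the numeric id to a string.

```python
import json


def prepare_document(raw):
    """Parse one raw stream message and return a dict ready to store.

    Returns None for messages that carry no top-level "id"
    (the stream also sends notices that are not tweets).
    """
    doc = json.loads(raw)
    if "id" not in doc:
        return None
    # CosmosDB requires the document "id" to be a string.
    doc["id"] = str(doc["id"])
    return doc


# Hypothetical payloads, trimmed down to a couple of fields.
tweet = '{"id": 1000123456789, "text": "Trying out #CosmosDB"}'
notice = '{"limit": {"track": 5}}'

print(prepare_document(tweet)["id"])  # the id is now a string
print(prepare_document(notice))       # no id, so nothing to store
```

Skipping messages without an id up front is a design choice of this sketch; the listener above gets the same effect through its exception handler.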


Step 4: Stream data from twitter to cosmosDB

Let's write the code that connects to Twitter and fetches the tweets for several hashtags. Tweepy needs to authenticate before it can read tweets, so pass the consumer key and secret and the access token and secret as follows.

import tweepy
from tweepy import OAuthHandler, Stream
from config import *
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)

Set the connection policy for CosmosDB as follows:

import pydocumentdb.documents as documents
import pydocumentdb.document_client as document_client

connectionPolicy = documents.ConnectionPolicy()
connectionPolicy.EnableEndpointDiscovery = True
connectionPolicy.PreferredLocations = preferredLocations

Next, create the client and start reading the tweets. We use the .filter method to get tweets related to particular hashtags.

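client = document_client.DocumentClient(host, {'masterKey': masterKey}, connectionPolicy)
# Id-based links: dbs/<database id>/colls/<collection id>
dbLink = 'dbs/' + databaseId
collLink = dbLink + '/colls/' + collectionId

twitter_stream = Stream(auth, CosmosDBListener(client, collLink))
# async=True runs the stream on a background thread. tweepy 3.7+ renamed this
# argument to is_async because async is a reserved word from Python 3.7.
twitter_stream.filter(track=['#CosmosDB', '#Microsoft', '#MVP', '#BigData', '#DataScience', '#Mongo', '#Graph'], async=True)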
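
The dbLink and collLink strings above follow the id-based layout dbs/<database>/colls/<collection>. Here is a small sketch of that string building, using the database and collection names from the config file in the next step. The helper names are hypothetical and not part of pydocumentdb.

```python
def database_link(database_id):
    """Return the id-based link for a database, e.g. 'dbs/tweepyDemo'."""
    return "dbs/" + database_id


def collection_link(database_id, collection_id):
    """Return the id-based link for a collection inside a database."""
    return database_link(database_id) + "/colls/" + collection_id


print(collection_link("tweepyDemo", "tweets"))
```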


Step 5: Creating configuration file

Create a config file named config.py with the following values:

# Enter CosmosDB config details below.
masterKey = ' ' 
host = ' '

# Enter your database, collection and preferredLocations here.
databaseId = 'tweepyDemo'
collectionId = 'tweets'
# List of Azure region names in order of preference; leave empty for the default.
preferredLocations = []

# Enter twitter OAuth keys here.
consumer_key = ''
consumer_secret = ''
access_token = ''
access_secret = ''

You need a CosmosDB account on Azure to get the master key and host values. If you are stuck, read my previous blog on how to set up a CosmosDB account.


You also need to register the script as a new application at the Twitter developer portal. After choosing a name for your app, you will be provided with a consumer key, consumer secret, access token and access token secret, which need to be filled into the above config.py to give the app programmatic access to Twitter.
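
Since an empty key only shows up as an authentication error once the stream starts, it can help to check the configuration first. Below is a minimal sketch of such a check. The function name missing_settings and the example values are assumptions made for illustration; only the setting names come from config.py above.

```python
REQUIRED_SETTINGS = [
    "masterKey", "host", "databaseId", "collectionId",
    "consumer_key", "consumer_secret", "access_token", "access_secret",
]


def missing_settings(settings):
    """Return the names of required settings that are absent or blank."""
    missing = []
    for name in REQUIRED_SETTINGS:
        value = settings.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Example with a half-filled configuration (placeholder values only).
example = {
    "masterKey": " ",
    "host": "https://example.documents.azure.com:443/",
    "databaseId": "tweepyDemo",
    "collectionId": "tweets",
    "consumer_key": "",
    "consumer_secret": "",
    "access_token": "",
    "access_secret": "",
}
print(missing_settings(example))
```

With the real file, you could run `import config` and pass `vars(config)` to the function.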




Step 6: Run the script

That's it folks. Now go to the command prompt and run the following command,

py cosmosdbdriver.py


You should see the tweets coming into your cosmosdb collection as follows.



The tweets you need are now in your CosmosDB collection, ready for further analysis. Hope it helps someone out there. If you are stuck at any point, look at the complete code from here.

Thursday, April 5, 2018

How to Import database from SQL Azure to local environment

One of the most frequent things developers want is a local copy of the development database. In this blog I will pen down the steps to export a database from a SQL Azure instance, import it to a local machine, and restore it on SQL Server.

Prerequisites:

You will need an Azure account; get the credentials from the Azure web portal.

Step 1:


Get the backup from the Azure instance as follows: select the database → right-click → Tasks → Export Data-tier Application.


Step 2:

Give a specific name for the backup file and save it in your desired location as follows,

Step 3:

That's it, you have taken a backup of the database from the SQL Azure instance to your local machine. Let's restore it locally. Copy the backed-up database to your C drive. Now open PowerShell with administrator rights and navigate to the C drive.


Step 4:

Download the PowerShell script RemoveMasterKey.ps1, which removes the master key from the backup, and keep the script on the same drive (in this case, C).
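
Before running the script in the next step, it can be handy to check whether the backup mentions a master key at all. This is a minimal sketch under two assumptions that are not stated in this post: that a .bacpac file is a ZIP archive containing a model.xml entry, and that a database master key shows up in that file as an element of type SqlMasterKey. The function name is made up for illustration.

```python
import zipfile


def bacpac_mentions_master_key(bacpac_path):
    """Return True if model.xml inside the archive mentions SqlMasterKey.

    Assumes the .bacpac is a ZIP archive with a model.xml entry.
    """
    with zipfile.ZipFile(bacpac_path) as archive:
        if "model.xml" not in archive.namelist():
            return False
        return b"SqlMasterKey" in archive.read("model.xml")
```

For the backup used in this post, the call would be bacpac_mentions_master_key(r"C:\identity.bacpac").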

Step 5:

Run the script as follows,

.\RemoveMasterKey.ps1 -bacpacPath "C:\identity.bacpac"

That's it, now you can restore it on MSSQL 2017 in your local environment.

Step 6:

Connect to your local server and click Databases → Import Data-tier Application.

Step 7:

Give a name for the database to restore.



Now you will see everything in green!


That's it folks, now you should be able to easily restore your development database in your local environment.


Sunday, March 25, 2018

Wear out the features of Azure CosmosDB with AspNetCore application


Azure CosmosDB is a superset of the service once known as “Azure DocumentDB”. In short: “Azure CosmosDB” = “Azure DocumentDB” + new data types + new APIs.


You can try CosmosDB for free on Azure, or you can set up the CosmosDB emulator in your local environment by following my previous blog. I am becoming a fan of .NET Core with all its features, and it is getting better day by day. In this blog post I want to take the initial steps of working with CosmosDB from a .NET Core client. After reading this blog, you should be able to do the following with CosmosDB programmatically,

  • Create Database
  • Create Collection
  • Create Documents  
  • Query a Document  
  • Delete Database  
Pre-Requisites Needed:

I have the following in my local environment. Hope you guys already have them 😊; if not, start setting them up.
  • Windows 10 OS
  • Azure CosmosDB Emulator
  • Visual Studio Code editor with C# plugin
  • .NET Core 2.0
Ok folks, let's get started.

Step 1: Create .NET Core Console Application

As in other tutorials, to keep it simple I will create a .NET Core console app to work with CosmosDB. With .NET Core we now have a CLI. Let's create the new app with the following steps (I've mentioned them in the previous blog).
  1. Open a command prompt or PowerShell (Administrator mode)
  2. Navigate to the folder where you need to create the app
  3. Execute the following command
dotnet new console -n CosmosCoreClient -o CosmosCoreClient

Here -n denotes the name of the application, and -o tells the CLI to create a folder with that name and create the application inside it.


Open the newly created project in Visual Studio Code by executing the following command from the project folder:
code .


Here is a screenshot of how it should look on your end:



I am using a C# 7.1 feature to create an async Main method in my console app. For that, we need to make a small change in the project file. Open the CosmosCoreClient.csproj file and add the following XML node to the PropertyGroup node.

<LangVersion>latest</LangVersion>

After changes, your csproj file should look like below:


Let's move to the core part of integrating CosmosDB with the .NET Core application and start building the features.

Step 2: Add CosmosDB Nuget Package

If you have followed the above steps, the application has been created successfully. Next, add a reference to the CosmosDB NuGet package to get the client libraries. The advantage of these libraries is that they make it easy to work with CosmosDB.
  1. Open a command prompt and navigate to root of your project.
  2. Execute the following command
dotnet add package Microsoft.Azure.DocumentDB.Core


You might wonder why the namespace has DocumentDB in it. DocumentDB is where the whole journey started, and hence the name sticks in the Cosmos world too. If you now look at the project file, a new reference for DocumentDB will have been added. Here is the screenshot of my project file.


Step 3: Creating Model for CosmosDB

Let's build the database. If you are new to CosmosDB, you should know that it has a query playground at https://www.documentdb.com/sql/demo. It is a sandboxed environment with a couple of databases where you can try out different queries. For this post, let's create a database for courses locally.

Since our application deals with courses, we need 4 models here.
  1. Course
  2. Session
  3. Teacher
  4. Student
Here are the models for the above 4.

Course.cs

using Microsoft.Azure.Documents;
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
public class Course : Document
{
    [JsonProperty(PropertyName = "CourseId")]
    public Guid CourseId { get; set; }

    [JsonProperty(PropertyName = "Name")]
    public string Name
    {
        get
        {
            return GetPropertyValue<string>("Name");
        }
        set
        {
            SetPropertyValue("Name", value);
        }
    }

    [JsonProperty(PropertyName = "Sessions")]
    public List<Session> Sessions { get; set; }

    [JsonProperty(PropertyName = "Teacher")]
    public Teacher Teacher { get; set; }

    [JsonProperty(PropertyName = "Students")]
    public List<Student> Students { get; set; }
}

Session.cs

using System;

public class Session
{
    public Guid SessionId { get; set; }

    public string Name { get; set; }

    public int MaterialsCount { get; set; }
}

Teacher.cs


using System;

public class Teacher
{
    public Guid TeacherId { get; set; }

    public string FullName { get; set; }

    public int Age { get; set; }
}

Student.cs


using System;

public class Student
{
    public Guid StudentId { get; set; }
    public string FullName { get; set; }

}
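
Since Session, Teacher and Student are nested inside Course, one course is stored as a single JSON document. As a language-neutral illustration, here is roughly what that shape looks like, sketched in Python with made-up values. The exact output of the .NET serializer (GUID formatting and the system properties the SDK adds) may differ.

```python
import json
import uuid

# Field names mirror the C# models above; all values are placeholders.
course = {
    "CourseId": str(uuid.uuid4()),
    "Name": "Sample course",
    "Teacher": {
        "TeacherId": str(uuid.uuid4()),
        "FullName": "Jane Doe",
        "Age": 40,
    },
    "Students": [
        {"StudentId": str(uuid.uuid4()), "FullName": "John Doe"},
    ],
    "Sessions": [
        {"SessionId": str(uuid.uuid4()), "Name": "Intro", "MaterialsCount": 3},
    ],
}

print(json.dumps(course, indent=2, sort_keys=True))
```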

Let's create the client as the next step.

Step 4: Creating the Client

Next, you need to instantiate the CosmosDB client before we do anything with the database. To connect to the local CosmosDB instance, we need to configure 2 things,

  1. URL of the CosmosDB instance
  2. Authentication key needed to authenticate.
When you start the local CosmosDB emulator, the database instance is available at https://localhost:8081. The auth key for the local emulator is a static key, which you can find in this article (https://docs.microsoft.com/en-us/azure/cosmos-db/local-emulator#authenticating-requests). This key works only with the local emulator and won't work with your Azure instance; if you are using an Azure instance, you can find the key in the Azure portal. Here is the code snippet to instantiate the client:


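// Requires: using System; using System.Threading.Tasks; using Microsoft.Azure.Documents.Client;
static string endpointUri = "https://localhost:8081";
        // Static key of the local emulator; it does not work against an Azure account.
        static string authKey = "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==";
        // These ids match the database and collection created in the steps below.
        static string dbName = "courseDatabase";
        static string collectionName = "courseDocumentCollection";
        static async Task Main(string[] args)
        {
            Console.WriteLine("Press Enter to run");
            Console.ReadLine();

            // Await the work so exceptions surface and the app does not continue early.
            await Run();

            Console.ReadLine();

        }
        private static async Task Run()
        {
            DocumentClient documentClient = new DocumentClient(new Uri(endpointUri),
                authKey);
        }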

When the Run method is executed, the client is instantiated against the local CosmosDB emulator.


Step 5: Let's start building the features

Next step is to build the features listed above. Let's add the following methods and call them from the async Run method.

Creating Database:

To create a new database programmatically, we make use of CreateDatabaseAsync() or CreateDatabaseIfNotExistsAsync(). When creating the database we pass the database name. Here is the code snippet:

private static async Task<Database> CreateDatabase(DocumentClient documentClient)
        {
            Database database = documentClient.CreateDatabaseQuery().Where(c => c.Id == "courseDatabase").AsEnumerable().FirstOrDefault();
            if (database == null)
            {
                database = await documentClient.CreateDatabaseAsync(new Database()
                {
                    Id = "courseDatabase"
                });
            }
            return database;
        }

When you refresh the local CosmosDB emulator page, you should see the database created as follows,



Creating Collection:

Once the database is created, we can then create a collection. We make use of CreateDocumentCollectionAsync() or CreateDocumentCollectionIfNotExistsAsync()

We will need to provide what is known as the database link (basically the URI at which the db can be reached) and the collection name to the create method. Here is the code snippet:

private static async Task<DocumentCollection> CreateDocumentCollection(DocumentClient documentClient, Database database)

        {
            DocumentCollection documentCollection = documentClient.CreateDocumentCollectionQuery(database.CollectionsLink).Where(c => c.Id == "courseDocumentCollection").AsEnumerable().FirstOrDefault();

            if (documentCollection == null)
            {
                documentCollection = await documentClient.CreateDocumentCollectionAsync(database.SelfLink, new DocumentCollection()
                {
                    Id = "courseDocumentCollection"
                });
            }

            return documentCollection;
        }

Now you should see that the collection for Course is created as follows,





Creating Document : 


After creating the database and collection, we can now create the documents. We use CreateDocumentAsync() for this purpose, passing the link of the collection under which we want to create the document and the document data itself. In this example we use the Course data model I showed earlier and pass it to the create method. Here is the code snippet:


private static async Task CreateCourse(DocumentClient documentClient, DocumentCollection documentCollection)
        {
            Course course = new Course()
            {
                CourseId = Guid.NewGuid(),
                Name = "En",
                Teacher = new Teacher()
                {
                    TeacherId = Guid.NewGuid(),
                    FullName = "Scott Hanselman",
                    Age = 44
                },
                Students = new List<Student>()
                {
                    new Student(){
                         FullName = "Trump",
                         StudentId = Guid.NewGuid()
                    }
                },
                Sessions = new List<Session>(){
                    new Session(){
                        SessionId = Guid.NewGuid(),
                        Name = "CosmosDB",
                        MaterialsCount = 10
                    },
                    new Session(){
                        SessionId = Guid.NewGuid(),
                        Name = "Ch1",
                        MaterialsCount = 3
                    }
                }
            };
            Document document = await documentClient.CreateDocumentAsync(documentCollection.DocumentsLink, course);
        }


You should see the document inserted in the local emulator as follows.



Querying Document:

Now that we have created a document, we can see how to query it. We can use the CreateDocumentQuery() method for this purpose, passing the link of the collection we want to query. We can then build the query as a LINQ expression and the client library does the rest. This is the best part of the client library: it translates the LINQ expression into a Cosmos DB SQL query and sends it to the service, without me having to crack my head constructing those requests. Here is the code snippet:

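private Course QueryCourse(Guid guid, string dbName, DocumentClient documentClient, string collectionName)
        {
            // Builds the collection URI from the ids, then filters by Name with LINQ.
            // The guid parameter is not used by this filter. The sample course created
            // earlier is named "En", so change the value below to match your own data.
            Course selectedCourse = documentClient.CreateDocumentQuery<Course>(
                             UriFactory.CreateDocumentCollectionUri(dbName, collectionName))
                             .Where(v => v.Name == "CosmosDB")
                             .AsEnumerable()
                             .FirstOrDefault();
            return selectedCourse;
        }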


Note that you will need to import System.Linq for the LINQ expression to work.
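
For comparison, the LINQ filter above corresponds roughly to a parameterized Cosmos DB SQL query. The sketch below builds such a query spec as plain Python data, in the query/parameters shape that the SQL API accepts. The helper name and the alias c are choices made here, and the SQL actually generated by the .NET LINQ provider may be worded differently.

```python
def course_by_name_query(name):
    """Build a parameterized query spec that selects courses by Name."""
    return {
        "query": "SELECT * FROM c WHERE c.Name = @name",
        "parameters": [{"name": "@name", "value": name}],
    }


spec = course_by_name_query("CosmosDB")
print(spec["query"])
print(spec["parameters"])
```

Passing the value as a parameter instead of concatenating it into the query text avoids quoting problems when the name contains special characters.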

Deleting Database:

Finally, we can make use of DeleteDatabaseAsync() method to delete the database programmatically. We will need to provide the database link to the delete method. We can use the UriFactory.CreateDatabaseUri() helper method to create the database link. Here is the code snippet:

await documentClient.DeleteDatabaseAsync(UriFactory.CreateDatabaseUri(dbName)); 

Well, those are the main features that the Azure CosmosDB client provides. If you are stuck with any of the steps above, you can check out the repository I have added with the samples.

Happy Coding! Let's spread Azure's CosmosDB to the world.