Showing posts with label google. Show all posts

Friday, September 8, 2017

Into the world of stackoverflow - Tips and tricks

I have wanted to write a post on how to contribute to Stack Overflow for a long time. Recently a colleague and I ran a small session on the topic, so I decided to share the experience here. It is also a good opportunity to address some misconceptions about Stack Overflow.

Let's dive into the world of Stack Overflow.

Programming became a lot easier once you could type a question into Google and find that someone had already asked the same thing on Stack Overflow. Stack Overflow offers a helping hand to developers everywhere. When Jeff Atwood and Joel Spolsky started it in 2008, one of the aims was to build a repository of great questions and answers. It changed the way questions are asked and answered. The main reason for its success is accountability: the site is managed by the community, and the community keeps the environment healthy. For example, if you post a bad video on YouTube it can still spread around the world, and getting it removed can take days and a lengthy process. On Stack Overflow, the community works to make sure that good content is delivered and maintained.
It combines several familiar concepts:
Wikipedia - Questions and answers are editable by anyone who has reached a certain level of reputation.
Digg/Reddit - It has a similar ranking system, in the form of votes and reputation, so that the best content rises to the top.
Forum - As in any forum, you can comment on someone's question or answer.
Blog - Blog posts can be linked from inside answers, which can bring more visitors to your blog.
Around 7 million active programmers ask and answer programming questions on Stack Overflow. Your activity there is a good reflection of what you do professionally. For any programmer, asking and answering questions on Stack Overflow is a great way to learn.

Let's see why you would want to contribute to the community through Stack Overflow. As a programmer you have a responsibility to give something back to the community. You also learn from others every day, which makes you a better programmer. How does Stack Overflow help you get better?
(i) You read awesome code written by others.
(ii) You get feedback on your code from other awesome developers.
(iii) The competition factor (reputation and ranking) automatically drives you to learn.
Being an awesome programmer will also get you noticed.

How to start if you are new to Stack Overflow?

Getting started on Stack Overflow is a bit hard, much like running a startup where the reputation you earn is your investment. Initially it is hard to find questions you can answer, and you can't ask for clarification without comment rights.

Tips:

(i) Set up a good but short list of Interesting and Ignored tags:
This is one simple tactic for finding questions to answer. If you are interested in answering Angular questions, for example, you can filter the questions by the angular tag, and the list keeps updating as new questions come in.



(ii) Try answering at a time of day or day of the week when there are fewer users on Stack Overflow and presumably less competition for answering questions.

Traffic data suggests that the heaviest users of Stack Overflow are in North America, so the quietest times are when North Americans are asleep. Try answering questions at that time.






















(iii) Earn your initial reputation by asking questions:
One of the simplest ways to earn your initial reputation is to ask questions and accept answers. Accepting an answer gives you 2 reputation, and a really good question can earn you 15 or more. One common mistake newcomers make is not accepting answers. Always make sure to mark the correct answer as accepted.







(iv) Rubber duck debugging - Asking a question and answering it yourself is one of the best ways to get your account started on Stack Overflow.
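Tip (i) can also be followed outside the site's own tag filter. As a minimal sketch, assuming you want to use the public Stack Exchange API (my choice for illustration, not something the session covered), the snippet below only builds the URL that lists the newest questions for one tag. The helper name newest_questions_url is hypothetical, and no network request is made.

```python
from urllib.parse import urlencode

# Root of the public Stack Exchange API (version 2.3).
API_ROOT = "https://api.stackexchange.com/2.3"


def newest_questions_url(tag: str, page_size: int = 10) -> str:
    """Build a URL that lists the newest Stack Overflow questions for a tag.

    Hypothetical helper for illustration; it only assembles the query string.
    """
    params = {
        "order": "desc",
        "sort": "creation",       # newest questions first
        "tagged": tag,            # the tag you marked as interesting
        "site": "stackoverflow",
        "pagesize": page_size,
    }
    return f"{API_ROOT}/questions?{urlencode(params)}"


print(newest_questions_url("angular"))
```

Opening the printed URL in a browser returns JSON, which is enough to check for fresh questions at the quieter hours mentioned in tip (ii).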

Let's see how to ask good questions.

There is already an FAQ section, but I would like to share my own experience of how to ask a good question.

It should be precise - The question should be neither too broad nor too narrow. A good question looks like this, for example:

"I tried X, here is my code, it does not work and I can't figure out why. Here is a piece of code you can use to reproduce the problem."
Research and proof-read before posting - Do a little research to check whether your question already exists on the site, and make sure your wording is clear.

Include all the relevant tags - Always tag your question with the relevant tags, which makes it easier to find, filter and answer.

Don't post a question and run away - One of the common mistakes I see with new users is that they do not respond when someone asks for clarification. It is always better to wait around for the first interaction, which makes answers come more easily.

Format the question - Always format your question with the editor's formatting tools so that it is easy for others to get a clear idea.

Next, let's see how to write a better answer.

Writing answers is the quickest way to earn reputation on Stack Overflow.

(i) Minimum viable answer - To earn reputation fast, start with a minimum viable answer. It should work and be properly formatted.

(ii) Iterate - Keep improving the answer by adding step-by-step instructions and explaining what the code does. Also attach a working code snippet with a high-level description of the concepts. You can use online tools to attach a demo to the answer; the best way is to use the existing code editor to show the demo or proposed solution. I will be writing a detailed post on this.

(iii) Give credit to the author - Whenever you take an answer from a blog or an existing Stack Overflow answer, credit the author by adding a link. Always include the relevant content in your answer rather than just the link, so that the answer stays useful even if the link is removed later.

Let's look at earning reputation and badges.

Reputation is a rough measure of how much the community trusts you, reflecting your communication skills and the quality of your questions and answers. The more reputation you earn, the more privileges you get.
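To make the arithmetic concrete, here is a rough sketch of how reputation adds up. Only the +2 for accepting an answer comes from this post; the other point values are my assumptions about the site's rules around the time of writing and may have changed since, so treat the table as illustrative.

```python
from typing import Iterable

# Assumed point values (illustrative). Only "accepted_an_answer" is taken
# from the post; check the site's help pages for the current numbers.
REPUTATION_EVENTS = {
    "question_upvoted": 5,
    "answer_upvoted": 10,
    "answer_accepted": 15,     # someone accepted your answer
    "accepted_an_answer": 2,   # you accepted an answer to your question
    "post_downvoted": -2,
}


def tally(events: Iterable[str], start: int = 1) -> int:
    """Add up reputation for a sequence of events, never dropping below 1."""
    total = start
    for event in events:
        total = max(1, total + REPUTATION_EVENTS[event])
    return total


first_week = [
    "accepted_an_answer",
    "question_upvoted",
    "question_upvoted",
    "answer_upvoted",
    "answer_accepted",
]
print(tally(first_week))
```

Under these assumed values, a single question with two upvotes plus one accepted, upvoted answer already lifts a new account well past the 15 reputation needed to flag posts.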

Upvotes and downvotes:


If you want to thank someone, upvote their answer or mark it as accepted. If you see good questions that really help others, upvote them too. You can read more about reputation and how it works. Always try to upvote the best answers so that they rise to the top.

Always take downvotes in a sporting spirit. Downvotes are not meant to demotivate you or stop you from answering; they are a hint to make your answer or question better, so don't take them personally.

You should not specifically chase badges, because they come with the reputation you gain from experience. That said, there are some badges you can go after intentionally:

Self-Learner - Answer your own question
Autobiographer - Fill in all the details in your profile. Here is an example.
Analytical - Visit all sections of the site FAQ
Critic, Supporter, Editor - Cast your first downvote, cast your first upvote and make your first edit
Student or Teacher - Ask or answer a question and get 1 upvote
Commentator - Leave 10 comments
Promoter and Investor - Offer some of your reputation to other users as a bounty

Best practices:

Don't be in a hurry.

One suggestion: if you are using filters to look for questions on a specific technology, start with questions that are between 6 hours and a day old. Older questions that shouldn't be answered have likely already been downvoted, which helps you decide what to answer until you gain more experience.

Pick questions that will stretch you.
In general you will not earn as much reputation from older unanswered questions, but you get practice answering, and since there is little rush, you can pick a question that needs more research and take the time to figure it out.

Don't answer poor-quality questions.
Before answering a question, do a quick search on its title, and then on related search terms that might turn up an identical question. Too often, a question has already been answered before, sometimes many times. Vote to close or flag such questions using the "duplicate" reason, rather than posting yet another answer.

Make sure the question is clear and unambiguous, concisely stated, and contains a useful code example (most questions really should have a full Minimal, Complete, and Verifiable code example). If you have enough reputation, edit the question to improve it so that it meets those standards. If not, downvote it.

It's always helpful to post a comment under the question to encourage the author to improve it, explaining precisely how they can do so. Sometimes this is all it takes to get a question on the right track.

If this doesn't help, you can flag posts that need to be closed, an ability you unlock at 15 rep. This is a great way for low-rep users to help get questions closed. For example, if you find a duplicate, you can still flag the question even if you can't vote to close. This leaves a comment linking to the other post and puts the question in the Close Vote review queue for others to see.

If you don't have enough reputation to post a comment, focus on other activities that will earn you the necessary reputation (for example, answering other, high-quality questions or editing posts that could use some help).

Things To Avoid:


We have seen how to do things the right way. Now let's see what you should not do in your eagerness to gain more reputation. Here are a few common mistakes that could quickly get you into hot water:


Do not undo useful edits.
You would be surprised how many users get angry when their posts are edited! If an edit improves your post, for example by fixing grammar or formatting, leave it in place even if you would have worded things differently. However, if someone edits your post to completely change the wording, change your coding style, or otherwise change what your answer says, you are more than welcome to undo that edit.

Do not post a comment in the answer box.
This may seem self-explanatory, but a lot of people do this. Writing any form of "I don't have the rep to comment, so I'm posting this as an answer" does not excuse it.
Long story short, just don't do it. If you don't have the rep to do something, please don't try to work around it. Your post will be deleted at best, and you'll have angry users to deal with at worst.

Don't post just a link to answer a question.
Stack Overflow is meant to be a high-quality repository of problems and their solutions. To accomplish this, we like posts to be as self-contained as possible.
If you link to a tool, plugin or library that will solve a user's problem, great! But please explain why or how that tool helps. Even better, show how to use the tool to solve the problem, if possible. A brief code snippet showing how to use a library, for example, or the function the user needs to call and how to incorporate it, turns a low-quality or average answer into a good one.
If you are linking instead to documentation, a blog or similar to help explain something, quote the relevant part of the page and explain in your own words how it answers the question. If the relevant part is the entire linked page, summarize it as best you can and explain how it helps.

Don't plagiarize.
To meet the last point, you might decide to just copy and paste large parts of pages into your answer. Don't do this! If the section you're copying is large, try to summarize. Always link to the source page and give proper attribution. Put anything you quote this way in a blockquote: hit the quote key on the toolbar, or insert a > before each line of the quote. If you are quoting another answer on Stack Overflow, check whether the two questions might be duplicates of each other. If so, flag one of them as a duplicate of the better one.
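If you prepare answers in a text editor first, the "> before each line" rule is easy to automate. The helper below is a small sketch; the name as_blockquote and the "Source:" line are my own conventions, not a Stack Overflow requirement.

```python
def as_blockquote(text: str, source_url: str) -> str:
    """Prefix every line with '>' and append the attribution link.

    Empty lines get a bare '>' so that the quote stays in one block.
    """
    quoted = "\n".join(
        f"> {line}" if line else ">" for line in text.splitlines()
    )
    return f"{quoted}\n\nSource: {source_url}"


print(as_blockquote("first line\n\nsecond line", "https://example.com/post"))
```

The output can be pasted straight into the answer box, followed by your own explanation of how the quoted part answers the question.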

Don't post images of code or text.
This might also seem like a no-brainer, but a lot of people do this. Posting images of your code, error message, console output, etc. makes it harder for your post to be found through searches, and for readers to paste what you have presented into their editor. It also makes things much more difficult for users with screen readers. If you can't copy and paste, it is far preferable to type out by hand whatever you can. If an error is long, try to post the most relevant part of it and not the whole thing.

Don't try to be the fastest gun in the west
A well thought-out answer that takes longer to write is better than "Try this" followed by a code dump. Posting an incomplete answer so it gets seen first can lead to down votes from users who don't find it useful, even if you intend to improve it with later edits.


What is #SOreadytohelp?

Many high-reputation profiles on Stack Overflow carry this tagline. It comes from a 2015 contest celebrating 10 million questions on Stack Overflow, in which users were asked to share their Stack Overflow experiences on Twitter with the #SOreadytohelp hashtag. So if you are new to Stack Overflow, add it to your profile.

Finally, here is the video of the session.



I will be writing a separate post on the benefits of being an active user on Stack Overflow. Those are some of the tricks you can take from this post to gain more reputation on Stack Overflow. More importantly, keep learning, share your experience and contribute to the community. Happy contributing!





Sunday, December 18, 2016

Am I really a developer or just a nethead?

It has been 6 years since I entered the programming field and 18 years since I started using a computer. Everyone thinks I am a computer geek. Sometimes, though, a question pops into my mind: am I really a developer, or just a good nethead?

That is because there has not been a single day on which I coded without using Google search and Stack Overflow.


My Experience

When I was 15, I wrote my first program. That was a long time ago now, and it was in Pascal. During my university days I was more interested in gaming and animation than in programming. When I got my first job, I struggled to code in C# on the .NET platform during the initial days. With the help of Google and Stack Overflow, I would now rate myself an 8 in C#, and I have experience with various open source technologies. But I still felt I was a better googler than a programmer.

What made me think I was a really bad programmer?
(i) Choosing workarounds over doing the right thing.
(ii) Using Ctrl+C and Ctrl+V more than any other keys.
(iii) When things went wrong, asking who was at fault rather than what the problem was.


Mistakes to avoid to become a better programmer
In 2016 I started to avoid the practices above. I would say programming is the first step to solving problems using technology. These are the tips I followed during the year to become a better programmer:



  1. Every day, find a small challenge that can be done in an hour.
  2. Read code. There is a plethora of freely available application code, and tons of free projects by others on GitHub.
  3. Make small projects to build experience. Make them open source and, if your project is compelling enough, encourage collaboration.
  4. Try programming for a day without googling. Continue for two days, maybe a week. See how it feels.
  5. Go to meetups and workshops, and meet others who feel the same way you do about technology.


Anyone can become a good developer if they are passionate about it and practice a lot, preferably daily. "In order to remain at the same level you have to spend at least two hours daily programming." There will be many programmers out there who think the same! What do you think?



Thursday, September 1, 2016

Digging into BigData with Google's BigQuery

I was one of the speakers at the Colombo Big Data Meetup held yesterday, where I spoke about Google's BigQuery. I have decided to write a post on it so that you can benefit too if you are a Big Data fan.

What is Big Data?


There are many definitions of Big Data, so let me explain what it really means. In the near future, every object on this earth will be generating data, including our own bodies. We are exposed to so much information every day. In this vast ocean of data there is a complete picture of where we live, where we go and what we say, all recorded and stored forever. More data allows us to see new, better and different things. In recent times data has changed from stationary and static to fluid and dynamic. We rely heavily on data, and it is a major part of any business.

We live in a very exciting world today: a world where technology is advancing at a staggering pace and data is exploding, with tons of data being generated. Ten years ago we measured data in megabytes; today we talk about data in petabytes, and in a few years we may reach the zettabyte era, which takes us to the end of the English alphabet. Does that mean the end of Big Data? No. If you have shared a photo, a post or a tweet on any social media, you are one of those generating data, and you are doing it very rapidly.

More than 100 thousand tweets are generated every 60 seconds, and more than 7 million posts have been made on Facebook before you finish reading this sentence. So data is being generated faster than you could ever have imagined. Big data and analytics have exploded recently, but there is a barrier: setting up the infrastructure takes a lot of money, resources and time, and it needs skilled people to make it all happen. Google addresses all of this with BigQuery, one of the products of Google Cloud Platform that lets us work with big data easily. It is Google's fully managed data analysis service in the cloud. It enables super fast analysis and lets you easily store and analyse big data on Google's infrastructure.

Let's get familiar with the components.
 Projects are the top-level items in Google Cloud Platform. A project contains users, authentication and billing information, and it is where datasets live.
 A dataset is a container for tables. Access control cannot be set on individual tables, so it is done through projects and datasets. Projects contain datasets, and datasets contain tables.
 Tables are where the data lives.
 Jobs are asynchronous processes that run in the background to load and export data and to execute large queries.
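The containment described above (project, then dataset, then table) can be sketched with plain Python dataclasses. This is only a mental model, not the BigQuery client library; the class names and the table_reference helper are hypothetical, and the project:dataset.table form mirrors how the sample table is written in the query later in this post.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Table:
    name: str
    rows: List[dict] = field(default_factory=list)  # where the data lives


@dataclass
class Dataset:
    name: str
    tables: Dict[str, Table] = field(default_factory=dict)


@dataclass
class Project:
    project_id: str
    datasets: Dict[str, Dataset] = field(default_factory=dict)

    def table_reference(self, dataset: str, table: str) -> str:
        """Return 'project:dataset.table', checking that both levels exist."""
        _ = self.datasets[dataset].tables[table]  # KeyError if missing
        return f"{self.project_id}:{dataset}.{table}"


samples = Dataset("samples", {"wikipedia": Table("wikipedia")})
publicdata = Project("publicdata", {"samples": samples})
print(publicdata.table_reference("samples", "wikipedia"))
```

Jobs are left out of the sketch because they are processes that act on these objects rather than containers themselves.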




Let's see how to use Google Cloud Platform for a big data solution. The architecture is divided into two workflows: a data workflow and a visualization workflow. We need to get our source data into BigQuery, using any ETL tool to pipe it into Google Cloud Storage. Extract the data from the source and denormalize it, because BigQuery prefers fewer joins, the fewer the better. We can use Hadoop clusters running on multiple machines to do much of the pre-processing and transformation of the data. Once the data is in BigQuery, it is all about visualization.

Most of the use cases are log analysis, which is used to analyse application behavior and user behavior in order to improve the system. Another is retail forecasting: the more data a business has, the more accurately it can predict product sales for the next month, which allows it to plan better. Let's see how we can use BigQuery to analyse lots of data in a very short time. Google handles the infrastructure, and we can simply focus on getting our data in and analysing it.
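As a rough illustration of the "extract and denormalize" step, the sketch below folds a hypothetical orders list into its customers so that no join is needed at query time, and then serialises the result as newline-delimited JSON, one of the formats BigQuery can load. The sample data and function names are made up for the example; a real pipeline would use an ETL tool as described above.

```python
import json
from collections import defaultdict

# Hypothetical source tables, as they might come out of a relational database.
customers = [
    {"customer_id": 1, "name": "Asha"},
    {"customer_id": 2, "name": "Ben"},
]
orders = [
    {"customer_id": 1, "item": "book", "amount": 12.5},
    {"customer_id": 1, "item": "pen", "amount": 1.2},
    {"customer_id": 2, "item": "lamp", "amount": 30.0},
]


def denormalize(customers, orders):
    """Nest each customer's orders inside the customer record."""
    by_customer = defaultdict(list)
    for order in orders:
        by_customer[order["customer_id"]].append(
            {"item": order["item"], "amount": order["amount"]}
        )
    # "orders" becomes a repeated, nested field on every customer row.
    return [{**c, "orders": by_customer[c["customer_id"]]} for c in customers]


def to_ndjson(records):
    """Serialise records as newline-delimited JSON, one record per line."""
    return "\n".join(json.dumps(r, sort_keys=True) for r in records)


records = denormalize(customers, orders)
print(to_ndjson(records))
```

Each output line is one self-contained record, so a query over customers and their orders scans a single table instead of joining two.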



Google handles Big Data every second of every day to provide services like Search, YouTube, Gmail and Google Docs. Can you imagine how Google handles this kind of Big Data during daily operations? How do they do it?

As an example, let's consider the following SQL query, which counts the Wikipedia® content titles that include numeric characters:

SELECT COUNT(*) FROM publicdata:samples.wikipedia
WHERE REGEXP_MATCH(title, '[0-9]+') AND wp_namespace = 0;

Notice the following:
• This "wikipedia" table holds all the change history records of Wikipedia's article content and consists of 314 million rows, which is 35.7GB.
• The expression REGEXP_MATCH(title, '[0-9]+') executes a regular expression match on the title of each change history record to extract rows that include numeric characters in the title (e.g. "United States presidential election, 2015").
• Most importantly, note that no index or pre-aggregated values were prepared in advance for this table.
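The quantifier in that pattern matters: '[0-9]*' also matches the empty string, so it accepts every title, while '[0-9]+' requires at least one digit. Here is a quick local check with Python's re module on three sample titles. It assumes REGEXP_MATCH looks for the pattern anywhere in the string, which is how re.search behaves; the titles other than the one quoted above are made up.

```python
import re

titles = [
    "United States presidential election, 2015",
    "Alan Turing",
    "Apollo 11",
]


def count_matches(pattern: str, titles) -> int:
    """Count titles in which the pattern is found anywhere in the string."""
    return sum(1 for title in titles if re.search(pattern, title))


# '*' allows zero digits, so even "Alan Turing" counts as a match.
print(count_matches("[0-9]*", titles))
# '+' needs at least one digit, which is what the query is meant to count.
print(count_matches("[0-9]+", titles))
```

The first count covers all three titles and the second only the two that contain digits.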

Dremel, the query engine behind BigQuery, can even execute a complex regular expression text match on a huge logging table of about 35 billion rows and 20 TB in merely tens of seconds. This is the power of Dremel: it scales extremely well, and most of the time it returns results within seconds or tens of seconds no matter how big the queried dataset is.

Two core technologies give Dremel this performance:

1. Columnar Storage. Data is stored in a columnar fashion, which makes it possible to achieve a very high compression ratio and scan throughput.
2. Tree Architecture. This is used for dispatching queries and aggregating results across thousands of machines in a few seconds.

Columnar Storage
Dremel stores data in columnar storage, which means it separates a record into column values and stores each value on a different storage volume, whereas traditional databases normally store the whole record on one volume. Columnar storage has the following advantages:

• Traffic minimization. Only the column values required by a query are scanned and transferred during execution. For example, a query "SELECT top(title) FROM foo" would access the title column values only. In the case of the Wikipedia table example, the query would scan only 9.13GB out of 35.7GB.
• Higher compression ratio. One study reports that columnar storage can achieve a compression ratio of 1:10, whereas ordinary row-based storage can compress at roughly 1:3. Because each column holds similar values, especially if the cardinality of the column (the variation of possible column values) is low, it is easier to achieve higher compression ratios than with row-based storage.

Columnar storage has the disadvantage of not working efficiently when updating existing records. Dremel simply doesn't support any update operations.
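Both advantages can be seen in a toy example. The sketch below turns a few made-up rows into per-column lists, reads only the title column, and run-length encodes two columns. Run-length encoding is my stand-in to show why low-cardinality columns compress well; it is not a claim about the encoding Dremel actually uses.

```python
from itertools import groupby

# Row-oriented view: every record carries all of its fields together.
rows = [
    {"title": "Apollo 11", "wp_namespace": 0, "language": "en"},
    {"title": "Alan Turing", "wp_namespace": 0, "language": "en"},
    {"title": "Talk:Apollo 11", "wp_namespace": 1, "language": "en"},
    {"title": "Colombo", "wp_namespace": 0, "language": "en"},
]


def to_columns(rows):
    """Pivot a list of records into one list of values per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values):
    """Collapse runs of equal neighbouring values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows)

# A query on titles touches one list; the other columns are never read.
titles = columns["title"]
print(titles)

# Similar values sit next to each other in a column, so they compress well.
print(run_length_encode(columns["language"]))      # [('en', 4)]
print(run_length_encode(columns["wp_namespace"]))  # [(0, 2), (1, 1), (0, 1)]
```

The flip side is also visible here: changing one record means touching every column list, which is why updates are awkward in a columnar layout.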

Tree Architecture
One of the challenges Google had in designing Dremel was how to dispatch queries and collect results across tens of thousands of machines in a matter of seconds. The challenge was resolved by using the Tree architecture. The architecture forms a massively parallel distributed tree for pushing down a query to the tree and then aggregating the results from the leaves at a blazingly fast speed.
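The idea of pushing a query down a tree and adding up partial results on the way back can be sketched in a few lines. In this toy version the "machines" are nested function calls that run one after another, whereas the real system runs the branches in parallel; the shard contents, the fanout parameter and the function names are all assumptions made for illustration.

```python
import re


def leaf_count(shard, pattern):
    """Leaf node: scan one shard of titles and count the matching ones."""
    return sum(1 for title in shard if re.search(pattern, title))


def run_query(shards, pattern, fanout=2):
    """Split the shards among child nodes and sum their partial counts."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    if not shards:
        return 0
    if len(shards) == 1:
        return leaf_count(shards[0], pattern)
    step = -(-len(shards) // fanout)  # ceiling division
    children = [shards[i:i + step] for i in range(0, len(shards), step)]
    # Each child answers for its own subtree; the parent only aggregates.
    return sum(run_query(child, pattern, fanout) for child in children)


shards = [
    ["Apollo 11", "Alan Turing"],
    ["Area 51"],
    ["Colombo", "Route 66"],
    ["Pascal"],
    ["Windows 95"],
]
print(run_query(shards, "[0-9]+"))
```

No node ever sees more than its own share of the data, and only small partial counts travel back up the tree.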


The tree architecture also enables multiple queries to run at once within the tree, which lets different users share the same hardware. You might have heard of Hadoop's MapReduce mechanism, so what is the difference between MapReduce and BigQuery?


BigQuery can be integrated into applications in many ways. The following integrations are supported:

REST API (SDK)
  1. Google Spreadsheet
  2. Web application


Interfaces for queries:
  1. Command line tool
  2. BigQuery UI


Connectors for Excel

Tools for a Big Data solution:
As shown in the architecture above, the following tools can be used for data ingestion and visualization.

Tableau, BIME and DigIn for analysing data and creating visualizations for various insights. Talend and SQLStream for ingesting data into BigQuery from various data sources.



Nothing comes free. Since Google handles the infrastructure, there is a bit of cost involved, and the pricing is as below.

Once you have decided to use BigQuery, there are certain things you need to know to optimize your queries and reduce cost.

Do not use queries that contain SELECT *, which scans every column of the table and therefore results in a high cost.
Since BigQuery supports nested fields, it is better to use repeated fields.
Store data in multiple tables where possible, since JOINs are not recommended.
BigQuery also supports tools such as ebq, for encrypting the data, and dry run, for checking how much data a query is going to process without actually executing it, which makes the job of developers and data analysts much easier.
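To see why avoiding SELECT * matters, here is a back-of-the-envelope calculation using the Wikipedia table figures quoted earlier (35.7GB for the whole table, 9.13GB for the title column). The price per terabyte is a placeholder I made up for the example, as is the 1024 GB per TB conversion, so check the current pricing page before relying on the dollar amounts.

```python
FULL_TABLE_GB = 35.7     # whole Wikipedia sample table (figure from this post)
TITLE_COLUMN_GB = 9.13   # title column only (figure from this post)

# Placeholder price, NOT an official figure.
HYPOTHETICAL_USD_PER_TB = 5.0


def scan_cost(gb_scanned, usd_per_tb, gb_per_tb=1024):
    """Estimate the cost of a query from the amount of data it scans."""
    return gb_scanned / gb_per_tb * usd_per_tb


select_star = scan_cost(FULL_TABLE_GB, HYPOTHETICAL_USD_PER_TB)
title_only = scan_cost(TITLE_COLUMN_GB, HYPOTHETICAL_USD_PER_TB)
saving = 1 - TITLE_COLUMN_GB / FULL_TABLE_GB

print(f"SELECT *     : ${select_star:.4f}")
print(f"title only   : ${title_only:.4f}")
print(f"data avoided : {saving:.0%}")
```

Whatever the actual price, the ratio holds: selecting only the title column avoids roughly three quarters of the scanned data, and a dry run tells you the bytes involved before you pay for anything.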

I will be writing two separate posts in the coming days on how to integrate with BigQuery and how to ingest data into BigQuery.

You can find the slides of the presentation here.