URL Shortener in Golang

TLDR; To learn new things, I tried writing a URL shortener called shorty. This is a first draft, and I am trying to approach it from first principles, breaking everything down to its simplest components.

I decided to write my own URL shortener to dive a little deeper into Golang and to learn more about systems. I plan not only to document my learning but also to find and point out different ways in which this application can be made scalable, resilient and robust.

The high-level idea is to write a server which takes a long URL and returns a short URL for it. I have one more requirement: I want to be able to provide a slug, i.e. a custom short URL path. So for a link like https://play.google.com/store/apps/details?id=me.farhaan.bubblefeed, I want to have a URL like url.farhaan.me/linktray, which is easy to remember and distribute.

I plan to implement this with two components: a CLI which talks to my server. I don’t want a fancy UI for now because I want it to be used exclusively through the terminal. It is a client-server architecture, where my CLI client sends a request to the server with a URL and an optional slug. If a slug is present, the short URL will use that slug; if it isn’t, the server generates a random string to make the URL short. Seen from a higher level, it’s not just a URL shortener but also a URL tagger.

The way a simple URL shortener works:

Flow Diagram

A client makes a request to shorten a given URL. The server takes the URL and stores it in the database, then generates a random string, maps the URL to the string and returns a URL like url.farhaan.me/<randomstring>.

Now when a client requests url.farhaan.me/<randomstring>, it goes to the same server, which looks up the original URL and redirects the request to that website.
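The redirect step can be sketched in Go roughly as below. This is only a minimal illustration, not shorty's actual code: the in-memory links map stands in for the database, and the names links, lookup and redirect are hypothetical. The handler is exercised in-process with httptest so nothing needs to listen on a port.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// links is a hypothetical in-memory stand-in for the database:
// short path -> original URL.
var links = map[string]string{
	"linktray": "https://play.google.com/store/apps/details?id=me.farhaan.bubblefeed",
}

// lookup resolves a request path such as "/linktray" to the original URL.
func lookup(path string) (string, bool) {
	longURL, ok := links[strings.TrimPrefix(path, "/")]
	return longURL, ok
}

// redirect sends the client to the original URL, or answers 404 if the
// short path is unknown.
func redirect(w http.ResponseWriter, r *http.Request) {
	longURL, ok := lookup(r.URL.Path)
	if !ok {
		http.NotFound(w, r)
		return
	}
	http.Redirect(w, r, longURL, http.StatusFound)
}

func main() {
	// Call the handler directly instead of starting a server.
	req := httptest.NewRequest(http.MethodGet, "/linktray", nil)
	rec := httptest.NewRecorder()
	redirect(rec, req)
	fmt.Println(rec.Code, rec.Header().Get("Location"))
}
```

In a real server the same handler would be registered with http.HandleFunc and served with http.ListenAndServe.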

The slug implementation is very straightforward: given a word, I search the database; if it is already present we raise an error, and if it isn’t we add it to the database and return the URL.
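A minimal sketch of that slug check, assuming an in-memory map in place of the real database. The Store type, AddSlug and ErrSlugTaken are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrSlugTaken is returned when a requested slug already maps to a URL.
var ErrSlugTaken = errors.New("slug already in use")

// Store is a hypothetical in-memory stand-in for the database.
type Store struct {
	bySlug map[string]string // slug -> long URL
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{bySlug: make(map[string]string)}
}

// AddSlug registers a custom slug and returns the short URL.
// It refuses to overwrite a slug that is already present.
func (s *Store) AddSlug(slug, longURL string) (string, error) {
	if _, exists := s.bySlug[slug]; exists {
		return "", ErrSlugTaken
	}
	s.bySlug[slug] = longURL
	return "url.farhaan.me/" + slug, nil
}

func main() {
	s := NewStore()
	fmt.Println(s.AddSlug("linktray", "https://play.google.com/store/apps/details?id=me.farhaan.bubblefeed"))
	// Asking for the same slug a second time is an error.
	fmt.Println(s.AddSlug("linktray", "https://example.com/other"))
}
```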

One optimization: since it’s just me who is going to use this, I can check whether the long URL already exists in the database, and if it does there is no need to create a new entry. But this should only happen for random strings and not for slugs. It is also a trade-off: less redundancy in the database at the cost of an extra lookup, and so more latency, on each request.

But when it comes to generating a random string, things get a tiny bit complicated. The length of the string and the size of its alphabet decide how many URLs you can store. There are various ways to generate it: I can hash the URL with something like md5, or encode a number in base10 or base64. I also need to make sure that every generated string is unique and never repeated.

Uniqueness can be maintained using a counter. The count can either be supplied by a separate service, which helps the system scale better, or be generated internally; I have used the database record number for this.
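One common way to turn such a counter into a short string is to write the number in a larger base. The sketch below uses base62 (digits plus lower- and upper-case letters), which is my assumption and not necessarily what shorty does; EncodeBase62 is a hypothetical name. Because every counter value is distinct, every encoded string is distinct too.

```go
package main

import "fmt"

// alphabet holds the 62 characters used as digits.
const alphabet = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// EncodeBase62 writes a counter value (for example a database record
// number) in base 62.
func EncodeBase62(n uint64) string {
	if n == 0 {
		return string(alphabet[0])
	}
	base := uint64(len(alphabet))
	var buf []byte
	for n > 0 {
		buf = append(buf, alphabet[n%base])
		n /= base
	}
	// Digits were produced least-significant first, so reverse them.
	for i, j := 0, len(buf)-1; i < j; i, j = i+1, j-1 {
		buf[i], buf[j] = buf[j], buf[i]
	}
	return string(buf)
}

func main() {
	for _, id := range []uint64{0, 61, 62, 125} {
		fmt.Println(id, EncodeBase62(id))
	}
}
```

With 62 characters, a string of length n can represent 62^n different records, so even 6 characters cover tens of billions of URLs.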

Looking at this from a system design perspective, we are using the same server to take the request and generate the URL and to redirect the request. This can be separated into two services, where one service generates the URL and the other just redirects. This way we increase the availability of the system: if one of the services goes down the other will still function.

The next step is to write and integrate a CLI to talk to the server and fetch the URL, a client that an end user can use. I am also planning to integrate a caching mechanism, not something off the shelf but rather a simple caching system of my own with some cache eviction policy.

Till then I will be waiting for the feedback. Happy Hacking.

I now have a Patreon open so that you folks can support me in doing this for a longer time and help me sustain myself too. So feel free to subscribe and help me keep doing this, with added benefits.

Link Tray

TLDR; Link Tray is a utility we recently wrote to curate links from different places and share them with your friends. The blogpost has technical details and probably some productivity tips.

Link Bubble got my full attention when I got to know about it. I felt it was a very novel idea: it helps save time and helps you curate the websites you visit. On the whole, and believe me I am downplaying it when I say this, Link Bubble does two things:

  1. Saves time by pre-opening the pages
  2. Helps you to keep a track of pages you want to visit

It’s a better tab management system; what I found weird was building a whole browser to do that. Obviously, I am being extremely naive when I say this, because I don’t know what it takes to build a utility like that.

Since it was discontinued for a while, I never got a chance to use it. I thought I would try building something very similar, but my use case was totally different. Generally when I go through blogs or articles, I open the links mentioned in different tabs to come back to them later. This has come back to bite me many times, because I just get lost in so many links.

I thought that if there were a utility which could capture the links on the fly, so that I could quickly go through them by looking at the titles, it might make my job easier. I bounced the idea off Abhishek and we ended up prototyping LinkTray.

Our first design was highly inspired by Facebook Messenger, but instead of chat heads we had links opened. If you think about it the idea feels very beautiful, but the design is “highly” not scalable. For example, with as many as 10 links opened we had trouble finding the links we were interested in, which was a beautiful design problem to face.

We quickly went to the whiteboard and put up a list of requirements from first principles. The ask was simple:

  1. To share multiple links with multiple people with least transitions
  2. To be able to see what you are sharing
  3. To be able to curate links (add/remove/open links)

We took inspiration from an actual drawer, where we flick out a bunch of links and go through them. In a serendipitous moment the design came to us, and that’s how Link Tray came to look the way it does now.

Link Tray

Link Tray was a technical challenge as well. There is a plethora of things I learnt about the Android ecosystem and application development that I knew existed but had never ventured into exploring.

Link Tray is written in Java, and I was using a very loosely maintained library to get the overlay activity to work. Yes, the floating activity or application that we see is called an overlay activity; it allows the application to be opened over an already running application.

The library that I was using doesn’t support Android O and above. It took me a few nights to figure that out 😞, also because I was hacking on the project during nights 😛. After reading a lot of GitHub issues I figured out the problem and put in support for the required OS versions.

One of the really exciting features that I explored in Android is Services. I think I might have read most of the blogs out there and all the documentation available, and I know that I still don't know enough. I was able to pick up enough pointers to make my utility work.

Just like Uncle Bob says: make it work, and then make it better. There was a persistent problem: the service needed to keep running in the background for the app to work. This was not a functional issue, but it was certainly a performance issue, and our users of version 1.0 did have a problem with it. People got misled because there was a constant notification that LinkTray was running, and it was annoying. This looked like a simple problem on the surface but was a monster underneath.

Architecture of Link Tray

The solution to the problem was simple: stop the service when the tray is closed, and start the service when a link is shared to Link Tray again. I tried that; the service did stop, but when a new link was shared the application kept crashing. Later I figured out that the bound service started by the library I am using sets a bound flag to true, but the library was resetting this flag in the wrong place. This prompted me to write this StackOverflow answer to help people understand the lifecycle of a service. Finally, after a lot of logs and debugging sessions, I found the issue and fixed it. It was one of the most exciting moments, and it helped me learn a lot of key concepts.

The other key learning I got while developing Link Tray was about multithreading. When a link is shared to Link Tray, we need the title of the page, if it has one, and the favicon of the website. Initially I was doing this on the main UI thread, which is not only an anti-pattern but also a usability hazard: it was a network call which blocked the application until it completed. I learnt how to make a network call on a different thread and keep the application smooth.

The initial approach was to get a WebView to work: we were literally opening the links in a browser and getting the title and favicon out. This was a very heavy process, because we were spawning a browser just to get information about links. In the initial design it made sense, because we were giving an option to consume the links. Over time our design improved and we came to a point where we don’t give the option to consume but to curate. Hence we opted for web scraping; I used custom headers so that we don’t get caught by robots.txt. After so much effort it got to a place where it is stable and performing great.

It did take quite some time to reach the point where it is right now: it is fully functional and stable. Do give it a go if you haven’t; you can shoot any queries to me.

Link to Link Tray: https://play.google.com/store/apps/details?id=me.farhaan.bubblefeed

Happy Hacking!

Debugging React Native

Recently I have been dabbling in mobile application development. I wanted to do something cross-platform; mostly I was being lazy and wanted to write the application once and have it work on both iOS and Android. I had to choose between Flutter, which is a comparatively new framework, and React Native, which has been around for a while. I ended up choosing React Native, and thanks to Geeky Ants for NativeBase the development became really easy.

I am approaching the development of my application with a divide-and-conquer strategy: I have divided the UI of the application into different components and I am developing one component at a time. I am using Storybook for this. It is a beautiful utility which you should check out; with it we can visualise each small component and see how it will render.

While developing this application I constantly found myself asking, “Did this function get called?” or “What is the value of this variable here?”. My older self would have put in a print statement or, as in JavaScript, put a console.log on it. (Read it like Beyoncé’s put a ring on it.)

Having done that for some time, I asked myself: is there a better way to do this? And the answer is “YES!”. There is always a better way; we just need to find it. Now enters the hero of our story, the debugger. With it I can perform various operations, like putting up breakpoints, conditional and unconditional, and analysing the code.

Let me quickly walk you through this. VS Code has a React Native plugin that has to be configured for use in our project. Once that is done we are almost ready to use it. I faced a few issues while getting it to work initially, so I thought having some pointers upfront might ease development for other people.

Before I go into deeper details, here is a preview of how a debugger looks on React Native with VS Code.

Debugger In Action

So let’s get to the meat of it and how to get it to work. First and foremost we need to download and install React Native Tools from VS Code extensions.

Once that is done we are good to go. On the side bar you can see that there is a Debug button; when you click on it, VS Code opens up a configuration file.

launch.json

There are various options you can play around with, but the one that worked best for me was Attach to packager. Once the configuration is in, we need to start the packager. Mostly this is done with npm start, but VS Code also provides an option in the action bar at the bottom.

Action Bar

Once the packager is started we need to start our debugger: click on the Debug button on the sidebar and click on the Attach to packager icon. This will start your debugger from VS Code. By now your action bar should be blue, showing that the debugger has started but is not active yet.

Then, on the device where you are deploying the application, you need to enable Debug JS, and voilà! Your debugger will be active. The debugger has helped me a lot. I could inspect each variable and see what value it holds at what point in time.

Or I can step through the code in the debugger and trace the flow of control. I can even put a conditional breakpoint and see when a condition is met or not.

Debuggers have helped me a lot to make my development go faster; I hope this blog helps the readers too.

Happy Hacking!

Android Services

For the past few days I have been delving into Android to make a utility, an application that I can use when I am reading an article or researching something.

The premise is that the application itself doesn’t have a screen; what it relies on is background activity. It silently keeps running, and when an interrupt comes it performs an action.

Since I am not very well versed in how to make an Android application of this kind, I searched and found out about the component which does this: it’s called a Service. It is very similar to the concept of Linux services or daemons.

The application I am designing is basically a combination of an overlay activity and background services. I wouldn’t say that there will be no user interaction at all, but it will be really minimal, and the user doesn’t need any inherent knowledge of the service.

So it was time to do some more reading on android services.

Service is an application component that can perform long-running operations in the background, and it doesn’t provide a user interface. Another application component can start a service, and it continues to run in the background even if the user switches to another application.

Documentation

This was something that I really wanted. A caveat: this blogpost reflects my own understanding and knowledge of Android application development, so take things with a pinch of salt and let me know if there is anything wrong with my understanding.

When I read more about services I got to know there are 3 kinds of them:

Foreground

This is how you see Spotify work: even when the application is not on the display you can change songs through the notification, so there is one level of user interaction involved.

Background

Services which the user doesn’t need to know about or interact with, like updating a database, fetching some resources, etc.

Bound

Bound services are the ones which are attached to a user activity. The quickest example I can give is a music player: you don’t want the music to stop when you switch applications, and you want to be able to control the music when you switch back to the application.

Mostly people use services so that all the heavy lifting is done in the background. I had a unique case: what I wanted was a service that keeps running and observing, and reacts when something changes or when it is poked.

If you have seen the design of Facebook Messenger, the chat heads come to life only when you have a message; this was somewhat my use case.

The biggest thing that I learnt is that Android doesn’t allow you to run a background service without notifying the user. This is a restriction that arrived with Android Oreo.

Implementation

There are two kinds of implementation that Android provides: Service and IntentService.

The former, as its name suggests, is used to spawn a service and runs on the main thread. IntentService is something more peculiar: it lets you hand off work and do it without making your application wait. For example, suppose you are playing a game and you are in the middle of level 1; an IntentService can be spawned to download and keep all the data required for level 2 without affecting your gameplay.

Another amazing thing about a service is that it is a singleton: however many times you start a service, you are not going to interact with multiple objects; it’s the same instance you are going to talk to.

Conclusion

These are a few of the things I learnt about services in Android. I didn’t put much code here because most of it is available in the references. I enjoyed my time learning about how services are designed and how they are managed internally in Android. Let me know what you think about it.

Till then, Keep Hacking!

References:

https://www.hellsoft.se/how-to-service-on-android—part-2/

https://proandroiddev.com/deep-dive-into-android-services-4830b8c9a09

https://robertohuertas.com/2019/06/29/android_foreground_services/

https://medium.com/@harunwangereka/android-background-services-b5aac6be3f04

Word Embeddings Simplified

Recently I have been dealing with a lot of NLP problems and jargon. The more I read about it, the more intriguing and beautiful I find the way we humans try to transfer our knowledge of a language to machines.

However much we try, because of our laid-back nature we end up using already existing knowledge or existing materials to make machines understand a given language.

But machines, as we know, can only understand digits or, let’s be more precise, binary (0s and 1s). When I first laid my hands on NLP this was my first question: how does a machine understand that something is a word, a sentence or a character?

I am still a learner in this field (and life 😝), but what I could understand is that the information we are going to use has to be converted into binary or some kind of numerical representation for a machine to understand it.

There are various ways to “encode” this information into numerical form, and the result is what is called word embeddings.

What are word embeddings?

Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers. Conceptually it involves a mathematical embedding from a space with many dimensions per word to a continuous vector space with a much lower dimension.

Wikipedia

In short, a word embedding is a way to convert textual information into numerical form so that we can analyse it.

Analysis such as measuring the similarity between words or sentences, understanding the context in which a phrase or word is used, etc.

How are they formed?

Let’s try to convert a given sentence into a numerical form:

A quick brown fox jumps over the lazy dog

How do we convert the above sentence into a numerical form such that our machine, or even we, can perform operations on it? It’s hard to figure out the mathematics of language, but we can always try.

So let’s try. What we can do is get all the unique words in the sentence, sort them and make a list of them. But then how do we get a numerical representation for it? It’s time for us to visit our long-lost friend, the matrix.

Let’s get the words in proper order, i.e. unique and sorted: a, brown, dog, fox, jumps, lazy, over, quick, the.

Now we will try to convert these words into numerical form using some matrix concepts (mostly representation) so that we can make one word look different from another.

As you can see, the sentence has 9 unique words, so we take 9 blocks to represent each one. In more mathematical terms, each representation is called a vector, and the dimension of this vector is 1 x 9. So each word in this universe can be represented by a vector of that dimension, and we can now carry out operations on it to get our desired result.

A few prominent operations are measuring how similar or how different two vectors are. We can dive into that later.

The method that we just followed is a very brute-force way of doing this and is officially called One-Hot Encoding; it is closely related to Count Vectorizing, which counts how often each word occurs.
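The steps above can be sketched in a few lines of Go. This is only an illustration under my own assumptions: the function names Vocabulary and OneHot are hypothetical, and the sentence is lower-cased before sorting so that "A" and "a" count as the same word.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Vocabulary returns the unique, lower-cased words of a sentence in
// sorted order.
func Vocabulary(sentence string) []string {
	seen := make(map[string]bool)
	var words []string
	for _, w := range strings.Fields(strings.ToLower(sentence)) {
		if !seen[w] {
			seen[w] = true
			words = append(words, w)
		}
	}
	sort.Strings(words)
	return words
}

// OneHot returns a vector as long as the vocabulary with a single 1 at
// the position of the given word (all zeros if the word is unknown).
func OneHot(vocab []string, word string) []int {
	vec := make([]int, len(vocab))
	for i, v := range vocab {
		if v == word {
			vec[i] = 1
		}
	}
	return vec
}

func main() {
	vocab := Vocabulary("A quick brown fox jumps over the lazy dog")
	fmt.Println(vocab)
	for _, w := range vocab {
		fmt.Println(w, OneHot(vocab, w))
	}
}
```

Each word gets its own row, and stacking the rows gives a square matrix with 1s on the diagonal.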

Why do we do this?

The way we encoded the words above can be pretty useless, because it’s just a representation and carries no other information: we don’t know how two words are related, whether they are morphologically similar, etc.
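A quick way to see this is to compare two one-hot vectors with a dot product, one of the simplest similarity measures. The sketch below is an illustration only; Dot is a hypothetical helper, and the vectors assume the sorted nine-word vocabulary of the example sentence (a, brown, dog, fox, ...).

```go
package main

import "fmt"

// Dot returns the dot product of two vectors of equal length.
func Dot(a, b []int) int {
	sum := 0
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

func main() {
	// One-hot vectors for "dog" (position 2) and "fox" (position 3).
	dog := []int{0, 0, 1, 0, 0, 0, 0, 0, 0}
	fox := []int{0, 0, 0, 1, 0, 0, 0, 0, 0}

	fmt.Println(Dot(dog, fox)) // different words never overlap
	fmt.Println(Dot(dog, dog)) // a word only matches itself
}
```

Any two different words score 0 against each other, so the encoding says "dog" is exactly as unrelated to "fox" as it is to "the".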

The prime reason we want to have encoding is to find similar words, gauge the context of the topics etc.

There are various other techniques which actually produce intelligent embeddings that have an idea about what is going on.

As Hunter puts it

When constructing a word embedding space, typically the goal is to capture some sort of relationship in that space, be it meaning, morphology, context, or some other kind of relationship

and a lot of other embeddings like ELMo, USE etc. do a good job at that.

As we go ahead and explore more embeddings you will see that they become more and more complex, with layers of trained models introduced, etc.

We even have sentence embeddings which are way different from just word embeddings.

Conclusion

This was just the tip of the iceberg, or maybe not even that, but I thought it would be helpful for someone who is starting their exploration, because it took me time to get my head around this concept. Thanks a lot for reading.

Happy Hacking!

References:

http://hunterheidenreich.com/blog/intro-to-word-embeddings/

https://towardsdatascience.com/word-representation-in-natural-language-processing-part-ii-1aee2094e08a

https://towardsdatascience.com/document-embedding-techniques-fed3e7a6a25d?gi=7c5fcb5695df

The Late End Year Review – 2018

I know I am really, really late, but better late than never.
This past year has been really formative for me.

In this short personal retrospective post, I am just going to divide my experience into 3 categories, the good, the bad and the ugly best.

The Bad

  1. My father got really sick and I got really scared by the thought of losing him.
  2. I moved on from the first company I joined, because I was getting a bit stifled and yearned to learn and grow more.
  3. My brother got transferred, so I had to live without family for the first time in my life. I had never lived alone before this.
  4. I was not able to take the 3 month sabbatical I thought I could.
  5. I couldn’t find a stable home and was on the run from one place to another constantly.

The Good

  1. I learnt how to live alone. I learnt how to find peace while being alone. Because of this, I could also explore more books and more importantly I could spend more time by myself figuring out what kind of person I want to become.
  2. I got a job with Clootrack, where people are amazing to work with and there is so much to learn.
  3. I found the chutzpah to quit my job, even though I didn’t have a back up. In a roundabout way, it gave me the strength to take risks in life and the courage to handle its consequences.
  4. Bad times help you discover good friends. I am not trying to boast about it, (but you are 😝– ed) but I am thankful to God that I have an overwhelming number of good friends.
  5. I got asked out in a coffee shop! This has never happened to me before. (BUT YES! THIS HAPPENED!).
  6. I wrote a few poems this year, all of them heartfelt.
  7. I gained a measure of financial independence and the experience of how to handle things when everything is going south.
  8. I finally wrote a project and released it. I was fortunate enough to get a few contributors.
  9. I am more aware now, and have stopped taking people and time for granted.
  10. Started Dosa Culture.
  11. Applied to more conferences to give a talk.

The Best

  1. I read more this year and got to learn about a lot of things from Feynman to Krebs. I explored fiction, non fiction, self help, and humour.
  2. I went to Varanasi (home) more than I ever did in the last five years of my life. I spent lots of time with my parents. I am planning to do it more.
  3. Went on a holiday to Pondicherry. I went for a holiday for the first time, with the money I saved up for the trip. I saw the sunrise of 1st January sitting on Rock beach.
  4. Got rejected at all the conferences I applied for. No matter. It motivates me even more, to try harder, to dance on the edge, to learn more, do more. It helps me strive for greatness, while also being a good reality check.
  5. Spent more time on working on hobby projects and contributing to open source.
  6. Got a chance to be a visiting faculty, and teach programming in college.
  7. Lived more! Learnt More! Loved More!

I feel I might be missing quite a few things in the lists, but these are the few, that helped me grow as a person. They impacted me deeply and changed my way of looking at life.

I hope the coming year brings better experiences and more learning!

Until then,
Live Long and Prosper! (so cheesy – ed)

6 Bags and A Carton

This is not a technical post; this is something that I have been going through in life right now. A few months ago, when I left my first job (another time, another post 😉), I had a plan. I wanted to take a few months off and work on my technical knowledge and write amazing software and get a lot of learning out of my little sabbatical.

But I was not able to do that for a few reasons, primo being I had to move homes in Bangalore because my brother got transferred, so the savings that I had set aside wouldn’t be enough. This was not the end. When it rains, it pours apparently. My dad got super sick, he had a growth near his kidney which the doctors diagnosed as cancer. I got really scared with the situation I was going through. The thing about your parents is that no matter how much you fight with them or how much they “control” you; at the end of the day the thought of losing them can scare the hell out of you. For me, they are my biggest support system so I was not scared, I was terrified.

I gave it really deep thought and took a call. I needed to find a job. The sabbatical could wait. I started applying to companies and asking people if they needed an extra hand at work. One piece of advice – never leave a job unless you have another in hand. Luckily, I had my small pot of gold, savings, so even in this phase I was sustaining myself. Yes, savings are real and you should have a sufficient amount at any given point of your life. This helps you to take the hard decisions and also to think independently (what Jason calls F*ck you money).

It still feels like a nightmare to me. I used to feel that I would wake up and it would all be over. Reality check: it wasn’t a dream, so I have to live with it and make efforts to overcome this situation.

Taking up a job for me was important for two reasons,

  1. I have to sustain myself
  2. I need to have a back up in case my dad needs something (I also have super amazing siblings who were doing the same)

I realised one thing about prayer and God; yes, I believe in God, and I don’t know if prayer works but you definitely get the strength to face your problems and the unknown. I used to call my dad regularly asking how he was doing, and some days he could not speak all that much and he used to talk in his weak tone. I used to cry. I was in so much pain although it was not physical or visible. And then, I would cry again.

But tough times teach you a lot: they show you your real friends, they show you the people you care for and, as Calvin’s dad would have said, “It builds character!”. I have been through bad times before and the thing about time is, “It changes!”. I knew someday this bad time I am going through will change. Either the agony I am going through will reduce or I will get used to it.

So as I was giving interviews, within a month of moving on from my old job, I was offered one at Clootrack. I liked the people who interviewed me and I liked the ideas they have been working on. But I have seen people change and I have gone through bad experiences, and at no point did I want to repeat past mistakes, so I did a thorough background check before I said yes to them. I got a really good response, so here I am working with them.

The accommodation problem that I had was that my brother was shifting out of his quarters and I used to live with him. Well, I helped him pack and I still remember the time when I was bidding farewell to him and my sister-in-law. I had tears in my eyes and after my goodbyes, the moment I stepped into the house I could feel the emptiness and I cried the whole night. I could stay at the old place for a week, not more. At this point I can’t thank Abhinav enough for coming in as the support I needed. He graciously let me live with him as long as I wanted to. Apparently he needed help paying his bills :P. This bugger would never accept the fact that he helped me. When dad’s condition was getting bad he gave me really solid moral support. I had also shared my situation with Jason, Abraar, Kushal and Sayan. I received a good amount of moral support from each one of them, specially Jason. I used to tell him everything and he would just calm me down and talk me through it.

So when I shifted to Abhinav’s place all I had was 6 bags and a carton. My whole life was 6 bags and a carton. My office was a 2 hour bus ride one way and another 2 hours to come back. But I didn’t have any problems with this arrangement because this was the least of my problems. I literally used to live out of my bags and I wasn’t sure this arrangement would last long. I had some really amazing moments with Abhinav, I enjoyed our ups and downs and those little fights and leg pulling.

Well, my dad is still not in the best of his health, but he is doing better now. I visit my family more frequently now and yes, call them regularly without a miss. I realised the value of health after seeing my dad. I went home after a month of joining Clootrack and stayed with him for a whole month and worked remotely; we visited a few doctors and they said he is doing better. After coming back I realised I was not getting any time for myself so I shifted to a NestAway near my office. Although I feel I’ve gotten used to the agony, you never know what life has in store for you next.
It feels much better now, though.

I thank God for giving me strength and my friends and family for supporting me in a lot of different ways.

With Courage in my Heart,
And Faith over Head

 

 

File Indexing In Golang

I have been working on a pet project to write a File Indexer, which is a utility that helps me to search a directory for a given word or phrase.

The motivation behind building this utility was so that we could search the chat log files for dgplug. We have a lot of online classes and guest sessions, and at times we just remember a name or a phrase used in the class; backtracking through the files using these phrases isn’t possible as of now. I thought I would take a stab at this issue, and since I am trying to learn Golang I used it to implement my solution. It took me two weeks, in which I spent time upskilling on certain aspects and coming up with a clean solution.

Exploration

This started with me exploring similar solutions, because why not? It is always better to improve an existing solution than to write your own. I didn’t find any which suited our need though so I ended up writing my own. The exploration led me to discover a few  libraries that proved useful. I found fulltext and Bleve.

I found bleve to have better documentation and some really beautiful thought behind the library. Really minimal yet effective. At the end of it all, I was sure I was going to use it.

Working On the Solution

After all the exploration I tried to break the problem into smaller pieces and then go about solving each one of them. So the first one was to understand how bleve worked. I found out that bleve creates an index first; for which we need to give it the list of files. The index is basically a map structure behind the scenes, where you give it the id and content to be indexed. So what could be a unique constraint for a file in a filesystem? The path of the file! I used it as the id to my structure and the content of my file as the value.

After figuring this out, I wrote a function which takes the directory as the argument and gives back the path of each file as well as its contents. After a few iterative improvements it split into two functions: one responsible for getting the paths of all the files and the other for reading a file and getting its content out.

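// fileNameContentMap walks config.RootDirectory and pairs every file's
// path with its content, ready to be fed to the indexer.
func fileNameContentMap() []FileIndexer {
	rootPath := config.RootDirectory
	var files []string
	var fileIndexer []FileIndexer

	// Collect the path of every file under the root directory.
	err := filepath.Walk(rootPath, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			// info may be nil when Walk reports an error for this path.
			return err
		}
		if !info.IsDir() {
			files = append(files, path)
		}
		return nil
	})
	checkerr(err)

	// Read each file and store its content against its path.
	for _, filename := range files {
		content := getContent(filename)
		fileIndexer = append(fileIndexer, FileIndexer{Filename: filename, FileContent: content})
	}
	return fileIndexer
}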

This builds a struct which stores the name of the file and the content of the file. And since I can have many files, I need a slice of said struct. This is how a simple data structure evolves into a complex one.

Now I have the utility of getting all files, getting content of the file and making an index.

This leads us to the next crucial step.

How Do I Search?

Now that I’ve prepped my data, the next logical step was to retrieve the search results. The way we search for something is by passing a query, so I duck-typed a function which accepts a string and then went on a spree of documentation lookups to find out how to search in bleve. I found a simple implementation which returns the id of the file (which is its path) and the match score.


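// searchResults opens the index stored at indexFilename and runs
// searchWord against it as a query string.
func searchResults(indexFilename string, searchWord string) *bleve.SearchResult {
	index, err := bleve.Open(indexFilename)
	// Same helper as above; assumed to stop on a non-nil error, so we
	// never touch an index that failed to open.
	checkerr(err)
	defer index.Close()

	query := bleve.NewQueryStringQuery(searchWord)
	searchRequest := bleve.NewSearchRequest(query)
	searchResult, err := index.Search(searchRequest)
	checkerr(err)
	return searchResult
}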

This function opens the index, searches for the term and returns the results.

Let’s Serve It

After all that is done I need to have a service which does this on demand, so I wrote a simple API server which has two endpoints, index and search. The way mux works is you give it the endpoint and the handler function to be mapped to it. I had to restructure the code in order to make this work. I faced a really crazy bug which, when I narrowed it down, turned out to be a memory leak, and yes, it was because I left the file read stream open. So remember: when you Open, always defer Close.

I used Postman to heavily test it and it was returning good responses. A dummy response looks like this:

[{"index":"irclogs.bleve","id":"logs/some/hey.txt","score":0.6912244671221862,"sort":["_score"]}]

Missing Parts?

The missing part was that I didn’t use any dependency manager, which Kushal pointed out to me, so I ended up using dep to do this for me. The next one was one of my favourite problems of the project, and that was how to auto-index a file. Suppose my service is running and I add one more file to the directory; this file’s content wouldn’t come up in the search because the indexer hasn’t run on it yet. This was a fascinating problem and I tried to approach it from many different angles. First I thought I would re-run the service every time I add a file, but that’s not a graceful solution. Then I thought I would write a cron job which would ping /index at regular intervals, and yet again that struck me as inelegant. Finally I wondered if I could detect changes in a file. This led me to explore gin, modd and fresh.

Gin was not very compatible with mux so I didn’t use it. modd was really nice, but I needed to kill the server to restart it since two services cannot run on a single port, and every time I killed that service I killed the modd daemon too, so that possibility also got ruled out.

Finally, the best solution was fresh, although I had to write a custom config file to suit the requirement. This approach still has issues with nested repository indexing, which I am still figuring out.

What’s Next?

This project is yet to be containerised and there are missing test cases so I would be working on them, as and when I get time.

I have learnt a lot of new things about the filesystem and how it works, because of this project. This little project also helped me appreciate a lot of golang concepts and made me realise the power of static typing.

If you are interested you are welcome to contribute to file-indexer. Feel free to ping me.

Till then, Happy Hacking!

 

Template Method Design Pattern

This is a continuation of the design pattern series.

I had blogged about Singleton once, when I was using it very frequently. This blog post is about the use of the Template Design Pattern. So let’s discuss the pattern and then we can dive into the code and its implementation and see a couple of use cases.

The Template Method Design Pattern is actually a pattern to follow when there are a series of steps which need to be followed in a particular order. Well, the next question that arises is, “Isn’t every program a series of steps that has to be followed in a particular order?”

The answer is Yes!

This pattern diverges when it becomes a series of functions that have to be executed in the given order. As the name suggests it is a Template Method Design pattern, with stress on the word method, because that is what makes it a different ball game altogether.

Let’s understand this with an example of Eating in a Buffet. Most of us follow a set of similar specific steps when eating at a Buffet. We all go for the starters first, followed by main course and then finally, dessert. (Unless it is Barbeque Nation then it’s starters, starters and starters :))

So this is kind of a template for everyone Starters --> Main course --> Desserts.

Keep in mind that content in each category can be different depending on the person but the order doesn’t change which gives a way to have a template in the code. The primary use of any design pattern is to reduce duplicate code or solve a specific problem. Here this concept solves the problem of code duplication.

The concept of Template Method Design Pattern depends on, or rather is very tightly coupled with, Abstract Classes. Abstract Classes themselves are a template for derived classes to follow, but Template Design Pattern takes it a notch higher, where you have a template in a template. Here’s an example of a BuffetHogger class.

from abc import ABC, abstractmethod

class BuffetHogger(ABC):

    @abstractmethod
    def starter_hogging(self):
        pass

    @abstractmethod
    def main_course_hogging(self):
        pass

    @abstractmethod
    def dessert_hogging(self):
        pass

    def template_hogging(self):
        self.starter_hogging()
        self.main_course_hogging()
        self.dessert_hogging()

So if you see here, starter_hogging, main_course_hogging and dessert_hogging are abstract methods, which means every derived class has to implement them, while template_hogging uses these methods and will be the same for all derived classes.
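To see what the abstract methods buy us, here is a small sketch. LazyHogger and can_instantiate are made-up names for illustration, not part of the original example. LazyHogger skips dessert_hogging on purpose, and Python refuses to create an object from it.

```python
from abc import ABC, abstractmethod


class BuffetHogger(ABC):

    @abstractmethod
    def starter_hogging(self):
        pass

    @abstractmethod
    def main_course_hogging(self):
        pass

    @abstractmethod
    def dessert_hogging(self):
        pass

    def template_hogging(self):
        self.starter_hogging()
        self.main_course_hogging()
        self.dessert_hogging()


class LazyHogger(BuffetHogger):
    # Hypothetical subclass that leaves out dessert_hogging on purpose.
    def starter_hogging(self):
        print("Eat Paneer Tikka")

    def main_course_hogging(self):
        print("Eat Dal Makhani")


def can_instantiate(cls):
    """Return True if cls() works, False if it raises TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True


print(can_instantiate(LazyHogger))  # prints False
```

So a hogger that misses a step fails early, at creation time, rather than halfway through the buffet.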

Let’s have a Farhaan class who is a BuffetHogger and see how it goes.

class Farhaan(BuffetHogger):
    def starter_hogging(self):
        print("Eat Chicken Tikka")
        print("Eat Kalmi Kebab")

    def __call__(self):
        self.template_hogging()

    def main_course_hogging(self):
        print("Eat Biryani")

    def dessert_hogging(self):
        print("Eat Phirni")

Now you can create as many BuffetHogger subclasses as you want, and they’ll all have the same way of hogging. That’s how we solve the problem of code duplication.
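
As a rough sketch of that, here is a second hogger next to Farhaan. VegHogger and hog_all are hypothetical names added only for this illustration. Each class fills in its own dishes, while the order comes from template_hogging in the base class.

```python
from abc import ABC, abstractmethod


class BuffetHogger(ABC):

    @abstractmethod
    def starter_hogging(self):
        pass

    @abstractmethod
    def main_course_hogging(self):
        pass

    @abstractmethod
    def dessert_hogging(self):
        pass

    def template_hogging(self):
        # The fixed order lives here and nowhere else.
        self.starter_hogging()
        self.main_course_hogging()
        self.dessert_hogging()


class Farhaan(BuffetHogger):
    def starter_hogging(self):
        print("Eat Chicken Tikka")
        print("Eat Kalmi Kebab")

    def main_course_hogging(self):
        print("Eat Biryani")

    def dessert_hogging(self):
        print("Eat Phirni")


class VegHogger(BuffetHogger):
    # Hypothetical second hogger with a different menu.
    def starter_hogging(self):
        print("Eat Paneer Tikka")

    def main_course_hogging(self):
        print("Eat Veg Pulao")

    def dessert_hogging(self):
        print("Eat Gulab Jamun")


def hog_all(hoggers):
    """Run the same template for every hogger in the list."""
    for hogger in hoggers:
        hogger.template_hogging()


hog_all([Farhaan(), VegHogger()])
```

Neither subclass repeats the starters, main course, dessert sequence; they only say what is on the plate.
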
Hope this post inspires you to use this pattern in your code too.
Happy Hacking!

Benchmarking MongoDB in a container

The database layer of an application is one of its most crucial parts because, believe it or not, it affects the performance of your application. Now, with micro-services getting the attention, I was wondering if having a database container will make a difference.

As we have popularly seen, most of the containers in use are stateless, which means they don’t retain the data they generate. But there is a way to have stateful containers, and that is by mounting a host volume in the container. Having said this, there could be an issue with the latency of database requests. I wanted to measure how much this latency would be and what difference it makes if the installation is done natively versus in a container.

I am going to run a simple benchmarking scheme: I will make 200 insert requests (that is, write requests), keeping all other factors constant, plot the time taken for these requests and see what comes out of it.

I borrowed a quick script to do the same from this blog. The script is simple: it just uses pymongo, the Python MongoDB driver, to connect to the database and make 200 entries in a random database.


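import time

import pymongo
from pymongo import WriteConcern

m = pymongo.MongoClient()
# Ask for acknowledged writes (w=1), as the original insert() call did.
collection = m.tests.insertTest.with_options(write_concern=WriteConcern(w=1))

doc = {'a': 1, 'b': 'hat'}

for i in range(200):
    # insert_one() adds an _id to the dict it is given, so use a fresh copy.
    entry = dict(doc)

    start = time.time()
    # insert_one() replaces the old insert(), which PyMongo 4 removed.
    collection.insert_one(entry)
    end = time.time()

    executionTime = (end - start) * 1000  # Convert to ms
    print(executionTime)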

So I installed MongoDB natively first, ran the above script twice and took the second result into consideration. Once I did that, I plotted the time taken against the number of requests. The first request takes longer because it has to make the connection, with all the overhead that involves, and the plot I got looked like this.

 

[Figure: MongoDB running natively (time taken in ms vs. number of requests)]

The graph shows that the first request took about 6 ms but the subsequent requests took far less time.
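
One way to put numbers on this without reading them off a plot is to keep the first request apart from the rest. This is only a minimal sketch: time_calls and summarize are hypothetical helpers, not part of the borrowed script, and the workload here is a dummy stand-in rather than a real MongoDB insert.

```python
import statistics
import time


def time_calls(fn, n=200):
    """Call fn() n times and return each call's duration in milliseconds."""
    timings = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000)
    return timings


def summarize(timings_ms):
    """Report the first (warm-up) request separately from the others."""
    first, rest = timings_ms[0], timings_ms[1:]
    return {
        "first_ms": first,
        "median_ms": statistics.median(rest),
        "max_ms": max(rest),
    }


# Dummy workload (an assumption); swap in the real insert call to
# benchmark MongoDB itself.
timings = time_calls(lambda: sum(range(1000)), n=200)
print(summarize(timings))
```

Running the same summary against the native and the containered setup would give two small dictionaries that can be compared directly.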

Now it was time to try the same in a container, so I did a docker pull mongo, mounted a local volume in the container and started the container with

docker run --name some-mongo -v /Users/farhaanbukhsh/mongo-bench/db:/data/db -d mongo

This mounts the volume I specified to /data/db in the container. Then I did a docker cp of the script, installed the dependencies and ran the script twice again so that file creation doesn’t skew the timings.

To my surprise the first request took about 4 ms but subsequent requests took a lot more time.

[Figure: MongoDB running in a container (time in ms vs. number of requests)]

 

And when I compared them, the time difference, or the latency, for each write operation was considerable.

[Figure: MongoDB benchmark, comparison between native and containered MongoDB]

I had thought there would be a difference in time and performance, but I never thought it would be this huge. Now I am wondering what the solution to this performance issue is, and whether we can reach a point where containered performance is as good as native.

Let me know what you think about it.

Happy Hacking!