Vipul Limbachiya

My Personal Blogs

Hi Friends,

You may already know FriendFeed, an application that keeps you updated on your friends' activities across many services such as Digg, Twitter, Google Reader, GTalk status, Flickr, Picasa and many more (around 41 services are currently supported).

I liked it at first sight, and it has since become an essential daily tool for me. It has many features.

Today on Mashable I found a post about Greasemonkey scripts for FriendFeed, so I want to share it with you.

Please refer to the post “7 Essential Greasemonkey Scripts For FriendFeed”.

-
Vipul Limbachiya

Hi Friends,

I took the EKT (Engineering Knowledge Test) last Sunday (27 April). The test was not very tough, but beforehand I knew nothing about the exam: the paper style, the syllabus, and so on.

Many people like me want to know about the exam before facing it, so that they can prepare or at least see the topics it covers. I could not find any resource about this on the net, so I decided to write this post in the hope that it helps people who really want to prepare for the EKT.

What I write below comes only from my first EKT, so the scheme or question style may differ in upcoming exams, but it should at least give an idea of what the EKT is like.

There were two papers for the EKT:

- No negative marking
- 75 marks in total
- 60 minutes duration
- The 1st paper covers general engineering knowledge, and
- The 2nd covers the specialist stream you selected when filling in the form

General Engineering Test Paper:

This one needs a good amount of preparation. It had 40 questions, drawn from what we learned during our engineering studies.

The areas of study covered in this paper were:

- Chemistry (basics of atoms, elements, molecules, atomic structure, etc. and basic rules)
- Maths (probability, trigonometric functions and their rules, etc.)
- General aptitude
- A few questions on basic physics (satellites, gravity, pressure, etc.)
- A few questions on electrical engineering
- A few on strength of materials

Specialist Paper :

This paper had 35 questions.

For this paper there were 5 options:

- Electrical
- Aeronautical
- Electronics & Communication
- Electrical, Electronic and Instruments and
- Computer

My test was for Computer. This paper was easy, but the questions were so dated that it felt as if the paper had been set in the 1900s.

Questions were based on

- History of computers
- Basic electric elements used in computer
- Storage and input devices
- Peripherals
- SQL
- etc.

So that is the basic information about the test I took. I am waiting for the results.

Feel free to comment if you’ve any questions.


Vipul Limbachiya

Spiders are also known as robots, ants, crawlers, bots, worms and automated indexers. A spider is a program that browses (and captures) data from the World Wide Web in a methodical, automated manner.

A Few Definitions:

Seeds: The spider starts with a list of URLs to visit. This list is called the “Seeds”.

Crawl Frontier: After visiting a seed URL, the spider extracts all the links in the visited page and adds them to the list of URLs to visit in future. This list is called the “Crawl Frontier”.
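
To make the two definitions concrete, here is a minimal Python sketch of how seeds and a crawl frontier could work together. It is only an illustration: the function names, the tiny in-memory "web" and the example.com URLs are made up for this post, and a real spider would download pages instead of reading them from a dictionary.

```python
from collections import deque

def crawl_order(seeds, get_links, max_pages=100):
    """Visit pages breadth-first, starting from the seed URLs."""
    frontier = deque(seeds)   # crawl frontier: URLs waiting to be visited
    seen = set(seeds)         # every URL ever queued, so nothing is queued twice
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited

# A made-up, in-memory "web" standing in for real pages.
FAKE_WEB = {
    "http://example.com/": ["http://example.com/a", "http://example.com/b"],
    "http://example.com/a": ["http://example.com/b", "http://example.com/"],
    "http://example.com/b": [],
}

order = crawl_order(["http://example.com/"], lambda url: FAKE_WEB.get(url, []))
print(order)
```

The `seen` set is what stops the spider from going round in circles when pages link back to each other.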

Process:

Steps for creating a spider that gathers data from web pages:

  1. Fetch HTML from live webpage
  2. Store captured data in local file-system
  3. Parse stored HTML
  4. Gather required Data from HTML

Step 1: Fetch HTML from live webpage

Once it has its seeds, the spider starts fetching raw HTML from the available (uncrawled) URLs.

The process itself is not very complex, but making it perform well takes a lot of learning and adds complexity.

The points to consider carefully in this step are:

1. Check the access restrictions for the page being crawled (robots.txt).

2. Bandwidth usage of the spider's host.

3. Bandwidth usage of the site being crawled.

4. The overall load on the website being crawled.

5. The page may require an authenticated login and/or certain cookies in the visitor's browser before it allows a visit. In that case we need to build some functionality to handle this.

6. Check the content-type before downloading the whole stream. The response may contain unwanted content other than the HTML we are interested in (media streams, for example).
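
Points 1 and 6 can be sketched in a few lines of Python using only the standard library. This is a rough illustration, not a complete solution: the helper names, the sample robots.txt and the "MySpider" user-agent are all made up here, and a real spider would first download robots.txt from the target site and read the Content-Type header from the actual HTTP response.

```python
from urllib.robotparser import RobotFileParser

def build_robot_rules(robots_txt):
    """Parse the text of a robots.txt file into a rule checker."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

def is_html(content_type_header):
    """True if a Content-Type header value describes an HTML page."""
    media_type = content_type_header.split(";")[0].strip().lower()
    return media_type in ("text/html", "application/xhtml+xml")

# Made-up robots.txt content for illustration.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

rules = build_robot_rules(SAMPLE_ROBOTS)
allowed = rules.can_fetch("MySpider", "http://example.com/index.html")
blocked = rules.can_fetch("MySpider", "http://example.com/private/data.html")

print(allowed, blocked)
print(is_html("text/html; charset=utf-8"), is_html("video/mp4"))
```

Checking the header before reading the body means the spider can drop a large media stream without downloading it.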

Good article: How To - Write a web crawler in C#

http://www.thecodinghumanist.com/Content/HowToWriteAWebCrawlerInCSharp.aspx

Uncompleted article from the geek’s corner

http://thegeekscorner.googlepages.com/csharp_multithreaded_web_spider


Focus on Technology:
For better performance there are a few options, such as multi-threading and asynchronous calls. Whether to use one of them or both together depends entirely on the application's requirements. The ThreadPool is something to keep in mind when working with multi-threading.
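
As a rough illustration of the thread-pool idea, here is a small Python sketch using `ThreadPoolExecutor` from the standard library (a different tool from the .NET ThreadPool, but the same concept). The `fetch` function is a stand-in I made up: it only sleeps and returns fake HTML, where a real spider would download the page.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fetch(url):
    """Stand-in for a real download: waits briefly and returns fake HTML."""
    time.sleep(0.05)
    return url, "<html><body>page at %s</body></html>" % url

def fetch_all(urls, max_workers=4):
    """Fetch many URLs at once, using a fixed-size pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(fetch, urls))

pages = fetch_all(["http://example.com/%d" % i for i in range(8)])
print(len(pages))
```

The `max_workers` value caps how many pages are requested at the same time, which also helps with the bandwidth and load points listed above.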

Step 2: Store captured data in local file-system

After getting the HTML from a web page (seed), we need to store it somewhere locally so that other actions can be performed on it later.

The actions performed on this stored HTML are:

- Extract links from the HTML and build the crawl frontier for the next seed list.

- Extract the required data and gather information. Tools such as a search engine or a data repository can be built on top of this data.

The points to consider in this step are:

1. How and where to store the captured data: in a database or in the file-system.

2. Both the database and the file-system have their own pros and cons.

3. The stored data should support Unicode so that multilingual data is not corrupted.

4. The files or data must be deleted once they have served their purpose; otherwise they quickly turn into a junk bin taking up terabytes of space.
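
Here is a minimal sketch of the database option in Python, using the built-in `sqlite3` module. The table layout and function names are my own assumptions for illustration. It shows points 3 and 4: multilingual text survives a round trip, and the page is deleted once it has been processed.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) the page store. ':memory:' keeps it in RAM for this demo."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, html TEXT NOT NULL)"
    )
    return conn

def save_page(conn, url, html):
    conn.execute("INSERT OR REPLACE INTO pages (url, html) VALUES (?, ?)", (url, html))
    conn.commit()

def load_page(conn, url):
    row = conn.execute("SELECT html FROM pages WHERE url = ?", (url,)).fetchone()
    return row[0] if row else None

def delete_page(conn, url):
    """Remove a page once parsing is finished, so storage does not pile up."""
    conn.execute("DELETE FROM pages WHERE url = ?", (url,))
    conn.commit()

store = open_store()
save_page(store, "http://example.com/hi", "<p>नमस्ते, こんにちは</p>")
round_trip = load_page(store, "http://example.com/hi")
delete_page(store, "http://example.com/hi")
after_delete = load_page(store, "http://example.com/hi")
print(round_trip, after_delete)
```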

This step contains one sub-step, building the crawl-frontier list: the links available in the page are the most likely candidates for new seeds (aka the crawl frontier).

This sub-step also has some points to consider:

1. Does the link being added point to an external website or to the same site?

2. Is the link being added allowed to be crawled according to robots.txt?

3. Is the newly added link already present in the seed list or the frontier list?

4. File-storage space availability and security.

5. When using a database as storage, the database connection pool and some tweaks to speed up database operations should be considered.
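
Points 1 and 3 of the frontier-building sub-step can be sketched as follows. I used the HTML Agility Pack (a .NET library) for my own work; the sketch below swaps in Python's standard `html.parser` instead, and the class name, function name and sample HTML are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse

class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def new_frontier_links(page_url, html, already_known, same_site_only=True):
    """Return absolute links from `html` that are not yet in the seed/frontier lists."""
    collector = LinkCollector()
    collector.feed(html)
    site = urlparse(page_url).netloc
    found = []
    for href in collector.hrefs:
        # Resolve relative links and drop "#fragment" parts.
        absolute = urldefrag(urljoin(page_url, href)).url
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, etc.
        if same_site_only and parsed.netloc != site:
            continue  # external website
        if absolute in already_known or absolute in found:
            continue  # already in the seed or frontier list
        found.append(absolute)
    return found

SAMPLE = (
    '<a href="/about">About</a> '
    '<a href="http://other.example/x">External</a> '
    '<a href="/about#team">Team</a> '
    '<a href="mailto:me@example.com">Mail</a> '
    '<a href="/seen">Seen</a>'
)
links = new_frontier_links(
    "http://example.com/index.html", SAMPLE, {"http://example.com/seen"}
)
print(links)
```

A robots.txt check (point 2) would slot in as one more `continue` condition inside the loop.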

Focus on Technology:

It seems that database storage is a better option than the file-system, as it supports Unicode natively.

For extracting links I used a library from CodePlex called HTML Agility Pack (http://www.codeplex.com/htmlagilitypack). It is nice and useful.

Step 3: Parse stored HTML

Once the data we need is available locally, we can start parsing it and capturing the required information from each entry. Parsing involves a lot of complexity, and it takes some learning to find the best-suited parsing method. A parser is never the same for every web page; it is entirely specific to a particular web page (or to a module, if the site follows the same presentation across the module we need), so this step is a bit of a mind-bender.

HTML Agility Pack is somewhat helpful in this step: http://www.codeplex.com/htmlagilitypack

After parsing the HTML and getting the required data from it, the gathered data should be stored somewhere; this is our step 4.

Focus on Technology:

Rather than parsing the HTML directly, it is better to convert it to XML first, which makes the parsing easier.

Step 4: Gather required data:

In this step we actually build our repository by storing the data fetched in step 3. The example below shows, as a live scenario, how these steps make our spider work.

Example:

Our goal: extract a user's first name and last name from an orkut profile. We will go through our defined steps one by one.

1. Go to his/her profile URL and get the output HTML.

2. Store this output in our local storage (e.g. database or file).

a. Fetch all the links available in this page and build the crawl-frontier list (this depends entirely on requirements; it is not a mandatory step).

3. Parse this HTML and locate the HTML elements that contain our required data, here the user's first and last name. (In this case, once we can get the information for one user, the same approach will work for all profile pages.)

4. After getting our required data, store it in the database.
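
The parsing part of this example (step 3) might look something like the Python sketch below. Please note that the profile HTML and the `first-name` / `last-name` ids are purely hypothetical, invented for this illustration; they are not orkut's real markup, which you would have to inspect yourself. The parser is also deliberately simple and assumes the name elements contain plain text with no nested tags.

```python
from html.parser import HTMLParser

class FieldExtractor(HTMLParser):
    """Collects the text inside elements whose id is one of `wanted_ids`."""
    def __init__(self, wanted_ids):
        super().__init__()
        self.wanted_ids = set(wanted_ids)
        self.current = None   # id of the element we are currently inside, if any
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        element_id = dict(attrs).get("id")
        if element_id in self.wanted_ids:
            self.current = element_id
            self.fields[element_id] = ""

    def handle_data(self, data):
        if self.current is not None:
            self.fields[self.current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current field.
        self.current = None

def extract_name(html):
    """Return (first_name, last_name) from a profile page with the assumed ids."""
    extractor = FieldExtractor(["first-name", "last-name"])
    extractor.feed(html)
    first = extractor.fields.get("first-name", "").strip()
    last = extractor.fields.get("last-name", "").strip()
    return first, last

# Hypothetical profile markup, not taken from any real site.
PROFILE_HTML = (
    '<div class="profile">'
    '<span id="first-name"> Asha </span>'
    '<span id="last-name">Patel</span>'
    '</div>'
)

name = extract_name(PROFILE_HTML)
print(name)
```

Once this works for one profile page, the same extractor can be run over every stored profile page, and the resulting names saved to the database as step 4.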

So that is how a spider framework works.

Try procrastination!

Procrastination is a dirty word. It doesn’t need to be. Procrastination that stems from a lack of discipline, causes you to lose sight of your goals, and results in decreased productivity deserves a bad rap. But what about postponing or avoiding things that can otherwise cause us pain and frustration if we apply the go-forward, “get it done” approach? Is this type of procrastination such a bad thing? We don’t see it as a bad thing. In fact, we suggest that you include strategic procrastination among your most important tools for increased productivity.

Let’s take today’s postponement as an example. We were scheduled to travel into a remote part of British Columbia to visit a pulp mill construction site tomorrow. Actually, it is a deconstruction site because the mill is being dismantled and shipped to China for reconstruction. It snowed in the area last night and is expected to snow again tomorrow. We could still visit the site because the weather hasn’t been bad enough to shut it down. We simply figured that the place is dangerous enough as it is with all sorts of concrete and steel debris sticking up from the frozen ground. Adding a blanket of snow makes it worse. The travel to and from the site is also harder. We decided to put it off until next week and to cancel it altogether if things got worse in the meantime. This is an obvious example but the idea applies to more subtle things just as well.

There are a few good reasons to postpone things. Here is a list of seven places where you should consider applying a strategic postponement:

  1. Where problems go away with time. The above weather example is a typical instance of where time makes a problem go away. Snow melts and evaporates. Many medical problems go away with time. Don’t be too quick to order a back surgery when natural healing processes can do a much better job if given enough time.
  2. Where problems are best ignored. Email spam and quasi-spam is a great example of this. Going out and trying to stop the spammers and beating up on friends and associates who send you stuff you don’t want is likely going to be a waste of time and effort leading to increased frustration for everyone involved. Just ignore the spam and delay the responses to email that comes in multiples. A delayed but polite and short response to a group of emails from a friend or associate received over days, weeks or longer can save you time, effort and frustration.
  3. Where you have good back-up and support systems in place. Don’t feel overly obligated to arrange or attend a meeting where you have others who can take part or all of the load if you simply postpone the meeting. Many urgent meetings, whether scheduled or not, deserve to be postponed. Sometimes they become effectively canceled after a postponement because a constructive solution appears in the meantime.
  4. Where something more important comes up. Be careful to properly assess the relative importance of things that come up. Skipping lunch to take an urgent call from your stockbroker is probably more important if you are being asked to sell than to buy. Postpone the call rather than skip lunch if you value your health.
  5. Where you are getting into a deal. Most Japanese business people are experts at procrastinating when being asked to get into a new deal or venture. This gives them time to carefully consider the relevant aspects and prepare for whatever consequences there are. Once in the deal, you should be fully prepared to follow through. Don’t be too quick to buy into stuff.
  6. Where you are tired, hungry or angry. This should be obvious but often isn’t. If you need to rest, sleep or cool down, postpone whatever it is that is preventing you from obtaining your basic needs. For instance, if you haven’t slept more than four hours in the past day or if you are feeling ill, it would probably be a good idea to postpone any major decisions.
  7. Where people are on your back because you are known to be a doer. Rather than going ahead and doing everything you are asked to do every time, depending on your position and priorities, procrastinate once in a while. Sometimes a good approach is to use someone else’s tendency to procrastinate in your defense. For example, if someone asks you to do something right away, respond by requesting a prerequisite to your going ahead. Maybe request an approval, budget, briefing paper or other useful piece that will help with the overall outcome. Be careful not to create useless work by asking for something irrelevant that does not add value to the process.

There is no need to sweat all the stuff that comes your way as soon as it comes. By applying these Strategic Postponement tools, you will be able to increase your overall productivity, enhance your well-being, and more effectively move toward your goals at a pace of your choosing. Feel free to occasionally say “Not now, maybe later.”

[Originally Posted At Lifehack.org TatsuyaNakagawa on 2/1/08]
