Posts

Showing posts from February, 2014

SQL Indexes

Clustered Indexes: Use the B-Tree algorithm. A table can have only one clustered index. A clustered index is like a telephone directory: you look up a person's name alphabetically and find the phone number right there.

CLUSTERED INDEX SCAN: Occurs when a table with a clustered index is accessed and either the query doesn't use a non-clustered index on the table or the table has no non-clustered index. It performs badly when a large amount of data, covering most columns and rows, is retrieved.

CLUSTERED INDEX SEEK: Occurs when a table with a clustered index is accessed and the query locates specific rows in the B+ tree. It is generally efficient. Even so, evaluate whether a non-clustered index would help.

Non Clustered Indexes: Use the B-Tree algorithm. SQL 2005 allows 249 non-clustered indexes per table; SQL 2008 allows 999. A non-clustered index is like the index of a book: you get the page number of the item you were searching for, then turn to that page and read what you were looking for. ...
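The telephone-directory and book-index analogies above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how SQL Server stores data: sorted lists and binary search stand in for the B-Tree, and the table, names, and phone numbers are all made up.

```python
from bisect import bisect_left

# Hypothetical table rows: (name, phone). All data is invented for illustration.
rows = [
    ("Chen", "555-0103"),
    ("Adams", "555-0101"),
    ("Davis", "555-0104"),
    ("Baker", "555-0102"),
]

# "Clustered index": the rows themselves are kept in key (name) order,
# like a telephone directory. Only one such ordering can exist per table.
clustered = sorted(rows)
clustered_keys = [name for name, _ in clustered]


def clustered_seek(name):
    """Binary-search the ordered rows and return the full row, or None."""
    i = bisect_left(clustered_keys, name)
    if i < len(clustered_keys) and clustered_keys[i] == name:
        return clustered[i]
    return None


def clustered_scan(predicate):
    """Read every row in key order and keep those matching the predicate."""
    return [row for row in clustered if predicate(row)]


# "Non-clustered index": a separate sorted structure on phone that stores
# only a locator (here, the clustering key) instead of the row itself,
# like the index at the back of a book.
phone_index = sorted((phone, name) for name, phone in rows)
phone_keys = [phone for phone, _ in phone_index]


def nonclustered_seek(phone):
    """Find the locator in the phone index, then fetch the row it points to."""
    i = bisect_left(phone_keys, phone)
    if i < len(phone_keys) and phone_keys[i] == phone:
        locator = phone_index[i][1]
        return clustered_seek(locator)  # second lookup: "turn to that page"
    return None
```

For example, `clustered_seek("Baker")` finds the row directly, while `nonclustered_seek("555-0103")` needs two steps: one in the phone index and one in the ordered rows. `clustered_scan` touches every row, which is why a scan gets expensive on large tables.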

Big Data and Hadoop – Part 2

Apache Hadoop is a fast-growing big data framework.

Advantages:

Problems with Traditional Large-Scale Systems: Processor-bound, with lots of complex processing on bigger computers (changed with distributed computers); programming complexity; keeping data and processes in sync; finite bandwidth; partial failures.

Distributed Systems, the Data Bottleneck: Traditionally, data is stored in a central location and copied to processors at runtime. This is fine for a limited amount of data.

Types of Analysis with Hadoop: Text mining, index building, graph creation and analysis, pattern recognition, collaborative filtering, prediction models, sentiment analysis, risk assessment.

Nature of Analysis: Batch processing, parallel execution, distributed data.

Hadoop Users:
BlackBerry – Growth of data, analysis; ad hoc queries took too much time
CBS Interactive – Cross-site and historical analysis, identifying user patterns
Nokia - 3G Digital Modeling, Us...
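The "batch processing, parallel execution, distributed data" idea above can be illustrated with a small word-count sketch. This is not Hadoop's API: it is a plain-Python stand-in where each list in `partitions` plays the role of data held on a separate node, and the function names and sample text are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data, already split into partitions. In a real cluster each
# partition would live on a different machine; here they are just lists.
partitions = [
    ["big data needs big clusters"],
    ["data stays where it is stored"],
    ["clusters process data in parallel"],
]


def map_partition(lines):
    """Count words within one partition (work done next to the data)."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def reduce_counts(partials):
    """Merge the small per-partition results into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_job(parts):
    """Process all partitions in parallel, then combine the results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_partition, parts))
    return reduce_counts(partials)


word_counts = run_job(partitions)
```

The design point is that only the small per-partition counts move to the combining step, rather than all the raw data being copied to one central processor at runtime.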