Archive for May, 2009

Slicing InfoTech 100

Update (Friday, May 22, 2009 – 02:32 PM): After all that geeking out, I found a much better representation of the 2009 InfoTech 100 data on the Business Week website. It imports readily into Excel and has a lot of rich information to slice and dice by. I have saved the data in a Google spreadsheet for future slicing and dicing, maybe using a statistical language like R.

It all started with the newsflash that AMZN made #1 for the third time in a row on the Business Week InfoTech 100.

I wanted to slice and dice the data:

      Rank   Company                            Country
      --------------------------------------------------
         1  AMAZON.COM                          U.S.
         2  ORACLE                              U.S.
         3  SAP                                 Germany
         4  INVENTEC                            Taiwan
         5  IBM                                 U.S.
         6  BHARTI AIRTEL                       India
         7  QUANTA COMPUTER                     Taiwan
         8  WISTRON                             Taiwan
         9  TENCENT HOLDINGS                    China
        10  ACER                                Taiwan
        11  ACCENTURE                           U.S.
        12  MTN GROUP                           S. Africa
        13  HTC                                 Taiwan
        14  RESEARCH IN MOTION                  Canada
        15  SAMSUNG ELECTRONICS                 Korea
        16  AMERICA MOVIL                       Mexico
        17  SWISSCOM                            Switz.
        18  SAIC                                U.S.
        19  APPLE                               U.S.
        20  HEWLETT-PACKARD                     U.S.
        21  DIMENSION DATA HOLDINGS             S. Africa
        22  MICROSOFT                           U.S.
        23  CHINA MOBILE                        China
        24  ILIAD                               France
        25  INFOSYS TECHNOLOGIES                India
        26  ASUSTEK COMPUTER                    Taiwan
        27  NETFLIX                             U.S.
        28  SK TELECOM                          Korea
        29  ERICSSON (LM) TELEFON               Sweden
        30  TATA CONSULTANCY SERVICES           India
        31  TELSTRA                             Australia
        32  VERIZON COMMUNICATIONS              U.S.
        33  BYD                                 China
        34  QUALCOMM                            U.S.
        35  TURKCELL ILETISIM HIZMETLERI        Turkey
        36  VODAFONE GROUP                      Britain
        37  GOOGLE                              U.S.
        38  DELL                                U.S.
        39  ZTE                                 China
        40  HON HAI PRECISION INDUSTRY          Taiwan
        41  COMPUTERSHARE                       Australia
        42  COMPUTER SCIENCES                   U.S.
        43  WIPRO                               India
        44  ENTEL-EMPRESA NACIONAL TELECOM      Chile
        45  TELE NORTE LESTE PARTICIPACO        Brazil
        46  EMBRATEL PARTICIPACOES              Brazil
        47  PRICELINE.COM                       U.S.
        48  TAIWAN SEMICONDUCTOR MFG.           Taiwan
        49  NOKIA                               Finland
        50  COMPAL ELECTRONICS                  Taiwan
        51  COGNIZANT TECH SOLUTIONS            U.S.
        52  SIMPLO TECHNOLOGY                   Taiwan
        53  LG ELECTRONICS                      Korea
        54  VTECH HOLDINGS                      Hong Kong
        55  AUTOMATIC DATA PROCESSING           U.S.
        56  CISCO SYSTEMS                       U.S.
        57  ROGERS COMMUNICATIONS               Canada
        58  MCAFEE                              U.S.
        59  FACTSET RESEARCH SYSTEMS            U.S.
        60  NIPPON TELEGRAPH & TELEPHONE        Japan
        61  BMC SOFTWARE                        U.S.
        62  AT&T                                U.S.
        63  SYNNEX                              U.S.
        64  AFFILIATED COMPUTER SVCS            U.S.
        65  LG DISPLAY                          Korea
        66  MULTI-FINELINE ELECTRONIX           U.S.
        67  L-3 COMMUNICATIONS HLDGS.           U.S.
        68  LG TELECOM                          Korea
        69  TECH DATA                           U.S.
        70  INTUIT                              U.S.
        71  EMC                                 U.S.
        72  BROADCOM                            U.S.
        73  AISINO                              China
        74  PC-WARE INFO TECHNOLOGIES           Germany
        75  SYBASE                              U.S.
        76  TOKYO ELECTRON                      Japan
        77  METROPCS COMMUNICATIONS             U.S.
        78  SPECTRIS                            Britain
        79  CHECK POINT SOFTWARE                Israel
        80  WESTERN DIGITAL                     U.S.
        81  AMPHENOL                            U.S.
        82  HOSIDEN                             Japan
        83  NATIONAL MOBILE TELECOM             Kuwait
        84  HARRIS                              U.S.
        85  FIDELITY NATIONAL INFO SVCS         U.S.
        86  SAMSUNG ELECTRO-MECHANICS           Korea
        87  NASPERS                             S. Africa
        88  HUTCHISON TELECOMMUNICATIONS        China
        89  SYNOPSYS                            U.S.
        90  TERADATA                            U.S.
        91  CA                                  U.S.
        92  GEMALTO                             Neth.
        93  JUNIPER NETWORKS                    U.S.
        94  TKH GROUP                           Neth.
        95  GROUPE STERIA                       France
        96  TELEFONICA DE ARGENTINA             Argentina
        97  NETAPP                              U.S.
        98  FISERV                              U.S.
        99  ADOBE SYSTEMS                       U.S.
       100  AMDOCS                              Britain

I started in Excel by pasting the list (all into one column) and then trying the Data -> Text to Columns feature.

It did not quite work as I expected: there was no clear delimiter, and splitting on spaces broke up the company names.
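To make the delimiter problem concrete, here is a minimal Python sketch (Python is my choice for illustration; it is not part of the original workflow). It splits one row from the list on whitespace, which is roughly what a space-delimited import does:

```python
# One row copied from the list above (leading padding omitted).
row = "14  RESEARCH IN MOTION                  Canada"

# Splitting on whitespace cannot tell the column gap apart from
# the single spaces inside the company name.
fields = row.split()
print(len(fields), fields)
```

The row has three logical columns, but the split yields five fields because "RESEARCH IN MOTION" is broken into three pieces.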

So I turned to my favorite editor, Vim, and started trying some regular expressions to make the data more Excel friendly.

Here are the commands that worked:

Remove spaces preceding rank:
:%s/^\s\+//gc

Add | as delimiter:
:%s/\>\s\{2\}/|/gc

Remove extra space after the second column:
:%s/|\s\+/|/gc
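
For anyone who would rather script the cleanup than do it in the editor, the same idea can be sketched in Python. This is only a sketch under my own assumptions: `to_pipe_delimited` is a hypothetical helper name, and it uses a slightly different rule than the commands above. It treats any run of two or more spaces as a column gap instead of relying on a word boundary, so names that end in a period (such as "MFG.") are split as well.

```python
import re

def to_pipe_delimited(line):
    """Turn one fixed-width row into 'rank|company|country'."""
    # Step 1: drop the padding before the rank (and any trailing spaces).
    stripped = line.strip()
    # Steps 2 and 3 combined: replace each run of 2+ spaces with one '|'.
    return re.sub(r"\s{2,}", "|", stripped)

# Three rows copied from the list above.
sample = [
    "         1  AMAZON.COM                          U.S.",
    "        12  MTN GROUP                           S. Africa",
    "        48  TAIWAN SEMICONDUCTOR MFG.           Taiwan",
]
for row in sample:
    print(to_pipe_delimited(row))
```

Single spaces inside a name or a country ("MTN GROUP", "S. Africa") are left alone, so each row comes out with exactly three fields.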

Finally, I can paste this data, do Data -> Text to Columns, and then use my favorite Data -> Filter on it to slice and dice. (I am still an Excel tyro, so Pivot Tables scare me, but it was not too bad for a simple analysis like this.)

So here is what I found out:

Out of the 100 companies, here is the count by country:

      Country        Total
      --------------------
      Argentina          1
      Australia          2
      Brazil             2
      Britain            3
      Canada             2
      Chile              1
      China              6
      Finland            1
      France             2
      Germany            2
      Hong Kong          1
      India              4
      Israel             1
      Japan              3
      Korea              6
      Kuwait             1
      Mexico             1
      Neth.              2
      S. Africa          3
      Sweden             1
      Switz.             1
      Taiwan            10
      Turkey             1
      U.S.              43
      --------------------
      Grand Total      100
http://spreadsheets.google.com/ccc?key=ri5-MWbnrrLrmPebL6H5rMg
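
The by-country tally can also be reproduced without a spreadsheet. The following is a rough Python sketch, not the method used in this post: `count_by_country` is a hypothetical helper, and the input is just the first ten rows of the list, written in the rank|company|country form that the cleanup commands produce.

```python
from collections import Counter

# First ten rows of the list above, in rank|company|country form.
rows = [
    "1|AMAZON.COM|U.S.",
    "2|ORACLE|U.S.",
    "3|SAP|Germany",
    "4|INVENTEC|Taiwan",
    "5|IBM|U.S.",
    "6|BHARTI AIRTEL|India",
    "7|QUANTA COMPUTER|Taiwan",
    "8|WISTRON|Taiwan",
    "9|TENCENT HOLDINGS|China",
    "10|ACER|Taiwan",
]

def count_by_country(lines):
    """Count companies per country from rank|company|country rows."""
    return Counter(line.split("|")[2] for line in lines)

totals = count_by_country(rows)
# Print countries alphabetically, then the overall total.
for country, total in sorted(totals.items()):
    print(f"{country:<12}{total:>3}")
print(f"{'Grand Total':<12}{sum(totals.values()):>3}")
```

Feeding all 100 cleaned rows through the same function should give the full country breakdown.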
