Datatype: CSV

CSV stands for comma-separated values. A CSV file is a plain text file that stores tabular data, one record per line, with the fields in each record separated by commas. These files are often used to exchange data between different applications.
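To make the format concrete, here is a minimal sketch that parses a few lines of CSV text with Python's standard csv module. The text in `sample_text` is a small stand-in assembled for illustration (it is not the full Growth Curves file), and the variable names are assumptions.

```python
import csv
import io

# Hypothetical stand-in: a header line followed by two records.
sample_text = (
    "LifeZone,AIDBSPP,0,1\n"
    "Tropical Dry,Red alder,0,0.003347\n"
    "Tropical Moist,Red alder,0,19.072762\n"
)

# csv.reader splits each line into a list of strings at the commas.
rows = list(csv.reader(io.StringIO(sample_text)))
header, records = rows[0], rows[1:]

print(header)
for record in records:
    print(record)
```

Every field comes back as a string here; converting columns to numbers is one of the things pandas handles automatically.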

One common way to open a CSV file in Python is with the pandas library. Pandas is an open-source library used for data analysis.

This example uses the Growth Curves file (Growth_Curves.csv).

Go through the following steps to open a CSV file:

  1. Install the pandas library with the pip command:

pip install pandas
conda install -c conda-forge pandoc
pip install --upgrade pandoc
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.


Note: you may need to restart the kernel to use updated packages.
  2. Import the pandas library as pd:

import pandas as pd
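A quick way to confirm that the installation worked is to print the installed version after importing. This is only a sanity-check sketch; the version string you see depends on your environment.

```python
import pandas as pd

# If the import succeeds, pandas is installed; __version__ reports which release.
print(pd.__version__)
```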
  3. Use the read_csv function (which comes with the pandas library), and pass the path to your CSV file as an argument.

  4. Use the print function to view the output:

df = pd.read_csv(r'Growth_Curves.csv')
print(df)
                                            LifeZone    AIDBSPP  0          1  \
0                                       Tropical Dry  Red alder  0   0.003347   
1                                     Tropical Moist  Red alder  0  19.072762   
2  Tropical Premontane Wet, Transition to Basal -...  Red alder  0  11.807128   
3  Tropical Premontane Wet, Transition to Basal -...  Red alder  0  18.881966   
4                                       Tropical Wet  Red alder  0  17.447870   
5                                    Premontane Rain  Red alder  0  18.725175   

           2          3          4          5           6           7  ...  \
0   0.089069   0.543011   1.816663   4.385234    8.627880   14.765182  ...   
1  37.493209  55.283649  72.465631  89.059964  105.086747  120.565391  ...   
2  23.364268  34.676713  45.749643  56.588130   67.197138   77.581525  ...   
3  37.176591  54.902144  72.076328  88.716292  104.838655  120.459516  ...   
4  34.574209  51.384943  67.885887  84.082750   99.981136  115.586545  ...   
5  37.075897  55.059652  72.683780  89.955473  106.881777  123.469600  ...   

          191         192         193         194         195         196  \
0  329.510638  329.510638  329.510638  329.510638  329.510638  329.510638   
1  556.935638  556.960397  556.984310  557.007405  557.029710  557.051253   
2  548.300479  548.498635  548.692597  548.882451  549.068286  549.250187   
3  605.569245  605.614412  605.658173  605.700574  605.741656  605.781459   
4  919.682397  920.182279  920.672950  921.154578  921.627331  922.091372   
5  916.619721  917.014933  917.402243  917.781807  918.153781  918.518317   

          197         198         199         200  
0  329.510638  329.510638  329.510638  329.510638  
1  557.072058  557.092152  557.111559  557.130302  
2  549.428236  549.602515  549.773104  549.940081  
3  605.820025  605.857391  605.893595  605.928673  
4  922.546862  922.993958  923.432814  923.863584  
5  918.875563  919.225665  919.568765  919.905005  

[6 rows x 203 columns]
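If you do not have Growth_Curves.csv at hand, the same call can be tried on in-memory text, because read_csv also accepts file-like objects. The sketch below uses a cut-down stand-in with three rows and five columns copied from the output above; `csv_text` and `sample` are names chosen for illustration.

```python
import io
import pandas as pd

# Cut-down stand-in for the real file: three rows, five columns.
csv_text = """LifeZone,AIDBSPP,0,1,2
Tropical Dry,Red alder,0,0.003347,0.089069
Tropical Moist,Red alder,0,19.072762,37.493209
Tropical Wet,Red alder,0,17.447870,34.574209
"""

# io.StringIO wraps the string so read_csv can treat it like an open file.
sample = pd.read_csv(io.StringIO(csv_text))
print(sample)
```

The first line of the text becomes the column names, and the remaining lines become rows indexed from 0.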
  5. To check the first five rows, use the head() method:

df.head()
LifeZone AIDBSPP 0 1 2 3 4 5 6 7 ... 191 192 193 194 195 196 197 198 199 200
0 Tropical Dry Red alder 0 0.003347 0.089069 0.543011 1.816663 4.385234 8.627880 14.765182 ... 329.510638 329.510638 329.510638 329.510638 329.510638 329.510638 329.510638 329.510638 329.510638 329.510638
1 Tropical Moist Red alder 0 19.072762 37.493209 55.283649 72.465631 89.059964 105.086747 120.565391 ... 556.935638 556.960397 556.984310 557.007405 557.029710 557.051253 557.072058 557.092152 557.111559 557.130302
2 Tropical Premontane Wet, Transition to Basal -... Red alder 0 11.807128 23.364268 34.676713 45.749643 56.588130 67.197138 77.581525 ... 548.300479 548.498635 548.692597 548.882451 549.068286 549.250187 549.428236 549.602515 549.773104 549.940081
3 Tropical Premontane Wet, Transition to Basal -... Red alder 0 18.881966 37.176591 54.902144 72.076328 88.716292 104.838655 120.459516 ... 605.569245 605.614412 605.658173 605.700574 605.741656 605.781459 605.820025 605.857391 605.893595 605.928673
4 Tropical Wet Red alder 0 17.447870 34.574209 51.384943 67.885887 84.082750 99.981136 115.586545 ... 919.682397 920.182279 920.672950 921.154578 921.627331 922.091372 922.546862 922.993958 923.432814 923.863584

5 rows × 203 columns

  6. To check the last five rows, use the tail() method:

df.tail()
LifeZone AIDBSPP 0 1 2 3 4 5 6 7 ... 191 192 193 194 195 196 197 198 199 200
1 Tropical Moist Red alder 0 19.072762 37.493209 55.283649 72.465631 89.059964 105.086747 120.565391 ... 556.935638 556.960397 556.984310 557.007405 557.029710 557.051253 557.072058 557.092152 557.111559 557.130302
2 Tropical Premontane Wet, Transition to Basal -... Red alder 0 11.807128 23.364268 34.676713 45.749643 56.588130 67.197138 77.581525 ... 548.300479 548.498635 548.692597 548.882451 549.068286 549.250187 549.428236 549.602515 549.773104 549.940081
3 Tropical Premontane Wet, Transition to Basal -... Red alder 0 18.881966 37.176591 54.902144 72.076328 88.716292 104.838655 120.459516 ... 605.569245 605.614412 605.658173 605.700574 605.741656 605.781459 605.820025 605.857391 605.893595 605.928673
4 Tropical Wet Red alder 0 17.447870 34.574209 51.384943 67.885887 84.082750 99.981136 115.586545 ... 919.682397 920.182279 920.672950 921.154578 921.627331 922.091372 922.546862 922.993958 923.432814 923.863584
5 Premontane Rain Red alder 0 18.725175 37.075897 55.059652 72.683780 89.955473 106.881777 123.469600 ... 916.619721 917.014933 917.402243 917.781807 918.153781 918.518317 918.875563 919.225665 919.568765 919.905005

5 rows × 203 columns
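Both head() and tail() return five rows by default, which is why the six-row file above shows rows 0 to 4 and rows 1 to 5. They also accept a row count. A minimal sketch on a hypothetical ten-row frame (`demo` and its columns are made up for illustration):

```python
import pandas as pd

# Hypothetical ten-row frame, used only to show the row-count argument.
demo = pd.DataFrame({"step": range(10), "value": [i * 1.5 for i in range(10)]})

first_three = demo.head(3)  # rows 0, 1, 2
last_two = demo.tail(2)     # rows 8, 9

print(first_three)
print(last_two)
```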

  7. To check the type of the object, pass the DataFrame to the built-in type() function:

print(type(df))
<class 'pandas.core.frame.DataFrame'>
  8. To check the shape of your DataFrame, use the shape attribute:

df.shape
(6, 203)

The DataFrame has 6 rows and 203 columns.
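Because shape is an ordinary tuple of (rows, columns), it can be unpacked into two variables. The sketch below uses a small hypothetical frame, not the Growth Curves data:

```python
import pandas as pd

# Hypothetical frame with 2 rows and 3 columns.
demo = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# shape is a tuple: (number of rows, number of columns).
n_rows, n_cols = demo.shape
print(f"{n_rows} rows, {n_cols} columns")
```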

  9. To check the column names, use the columns attribute:

df.columns
Index(['LifeZone', 'AIDBSPP', '0', '1', '2', '3', '4', '5', '6', '7',
       ...
       '191', '192', '193', '194', '195', '196', '197', '198', '199', '200'],
      dtype='object', length=203)
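Note that the numeric-looking column names ('0', '1', ...) are strings, not integers. One possible way to separate them from the label columns is sketched below on a one-row stand-in frame; the variable names `label_cols` and `numbered_cols` are assumptions made for this example.

```python
import pandas as pd

# One-row stand-in with the same kind of column names as the real file.
sample = pd.DataFrame(
    {"LifeZone": ["Tropical Dry"], "AIDBSPP": ["Red alder"], "0": [0], "1": [0.003347]}
)

# Convert the Index to a plain Python list of strings.
names = sample.columns.tolist()

# Split the names by whether they consist only of digits.
numbered_cols = [name for name in names if name.isdigit()]
label_cols = [name for name in names if not name.isdigit()]

print(label_cols)
print(numbered_cols)
```

A column is then selected with its string name, for example `sample["1"]` rather than `sample[1]`.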
  10. To get more information about the data, such as the index range, column data types, and memory usage, use the info() method:

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6 entries, 0 to 5
Columns: 203 entries, LifeZone to 200
dtypes: float64(200), int64(1), object(2)
memory usage: 9.6+ KB
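The dtype summary that info() prints can also be obtained as data, which is handy when you want to act on it in code. A minimal sketch on a two-row stand-in frame (the frame and variable names are assumptions for illustration):

```python
import pandas as pd

# Two-row stand-in mixing text, integer, and float columns.
sample = pd.DataFrame(
    {
        "LifeZone": ["Tropical Dry", "Tropical Moist"],
        "AIDBSPP": ["Red alder", "Red alder"],
        "0": [0, 0],
        "1": [0.003347, 19.072762],
    }
)

# Count how many columns have each data type.
dtype_counts = sample.dtypes.value_counts()

# Keep only the numeric (integer and float) columns.
numeric = sample.select_dtypes(include="number")

print(dtype_counts)
print(numeric.columns.tolist())
```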