Datatype: CSV

CSV stands for comma-separated values. A CSV file is a plain text file in which each line is one record and the fields within a record are separated by commas. These files are often used to exchange data between different applications.
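To make the "plain text" point concrete, here is a minimal sketch that parses a small CSV string with Python's built-in csv module. The content of `csv_text` is made up for illustration and is not taken from the Growth Curves file.

```python
import csv
import io

# Hypothetical CSV content: a header line followed by two data lines.
csv_text = "name,height_cm\nalder,120\noak,95\n"

# csv.reader splits each line on commas and yields a list of strings.
rows = list(csv.reader(io.StringIO(csv_text)))

header, records = rows[0], rows[1:]
print(header)   # ['name', 'height_cm']
print(records)  # [['alder', '120'], ['oak', '95']]
```

Every value comes back as a string here; converting columns to numbers is one of the things pandas does automatically.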

To open a CSV file with the steps below, you need to install the pandas library. Pandas is an open source library for data analysis.

In this example, we are using the Growth Curves file.

Go through the following steps to open a CSV file:

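Install the pandas library with the pip command. Only the first command below installs pandas; the pandoc lines concern a separate document conversion tool and are not needed to read a CSV file.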

pip install pandas
conda install -c conda-forge pandoc
pip install --upgrade pandoc

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.

Note: you may need to restart the kernel to use updated packages.
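One way to confirm that the installation worked is to import pandas and print its version. This is only a quick check; the version string you see depends on your environment.

```python
import pandas as pd

# __version__ is a plain string such as "2.x.y"; the exact value varies.
version = pd.__version__
print(version)
```

If the import raises ModuleNotFoundError, pandas is not installed in the environment that the kernel is using.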

Import the pandas library as pd

import pandas as pd

Use the read_csv function (part of the pandas library) and pass your CSV file path as the argument
Then use the print function to view the output

df = pd.read_csv(r'Growth_Curves.csv')
print(df)

                                            LifeZone    AIDBSPP  0          1  \
                                     Tropical Dry  Red alder  0   0.003347   
                                   Tropical Moist  Red alder  0  19.072762   
Tropical Premontane Wet, Transition to Basal -...  Red alder  0  11.807128   
Tropical Premontane Wet, Transition to Basal -...  Red alder  0  18.881966   
                                     Tropical Wet  Red alder  0  17.447870   
                                  Premontane Rain  Red alder  0  18.725175   

           2          3          4          5           6           7  ...  \
 0.089069   0.543011   1.816663   4.385234    8.627880   14.765182  ...   
37.493209  55.283649  72.465631  89.059964  105.086747  120.565391  ...   
23.364268  34.676713  45.749643  56.588130   67.197138   77.581525  ...   
37.176591  54.902144  72.076328  88.716292  104.838655  120.459516  ...   
34.574209  51.384943  67.885887  84.082750   99.981136  115.586545  ...   
37.075897  55.059652  72.683780  89.955473  106.881777  123.469600  ...   

          191         192         193         194         195         196  \
329.510638  329.510638  329.510638  329.510638  329.510638  329.510638   
556.935638  556.960397  556.984310  557.007405  557.029710  557.051253   
548.300479  548.498635  548.692597  548.882451  549.068286  549.250187   
605.569245  605.614412  605.658173  605.700574  605.741656  605.781459   
919.682397  920.182279  920.672950  921.154578  921.627331  922.091372   
916.619721  917.014933  917.402243  917.781807  918.153781  918.518317   

          197         198         199         200  
329.510638  329.510638  329.510638  329.510638  
557.072058  557.092152  557.111559  557.130302  
549.428236  549.602515  549.773104  549.940081  
605.820025  605.857391  605.893595  605.928673  
922.546862  922.993958  923.432814  923.863584  
918.875563  919.225665  919.568765  919.905005  

[6 rows x 203 columns]
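If you do not have the Growth Curves file at hand, the same call can be tried on in-memory text. The sketch below assumes a tiny made-up data set whose header only imitates the first few columns of the real file; the values are hypothetical.

```python
import io

import pandas as pd

# Hypothetical miniature data set; the values are invented for illustration.
csv_text = (
    "LifeZone,AIDBSPP,0,1\n"
    "Zone A,Red alder,0,1.5\n"
    "Zone B,Red alder,0,2.5\n"
)

# read_csv accepts a file path or any file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

The first line of the text becomes the column names, and each remaining line becomes one row of the dataframe.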

To check the top five rows:

df.head()

	LifeZone	AIDBSPP	1	2	3	4	5	6	7	...	191	192	193	194	195	196	197	198	199	200
0	Tropical Dry	Red alder	0.003347	0.089069	0.543011	1.816663	4.385234	8.627880	14.765182	...	329.510638	329.510638	329.510638	329.510638	329.510638	329.510638	329.510638	329.510638	329.510638	329.510638
1	Tropical Moist	Red alder	19.072762	37.493209	55.283649	72.465631	89.059964	105.086747	120.565391	...	556.935638	556.960397	556.984310	557.007405	557.029710	557.051253	557.072058	557.092152	557.111559	557.130302
2	Tropical Premontane Wet, Transition to Basal -...	Red alder	11.807128	23.364268	34.676713	45.749643	56.588130	67.197138	77.581525	...	548.300479	548.498635	548.692597	548.882451	549.068286	549.250187	549.428236	549.602515	549.773104	549.940081
3	Tropical Premontane Wet, Transition to Basal -...	Red alder	18.881966	37.176591	54.902144	72.076328	88.716292	104.838655	120.459516	...	605.569245	605.614412	605.658173	605.700574	605.741656	605.781459	605.820025	605.857391	605.893595	605.928673
4	Tropical Wet	Red alder	17.447870	34.574209	51.384943	67.885887	84.082750	99.981136	115.586545	...	919.682397	920.182279	920.672950	921.154578	921.627331	922.091372	922.546862	922.993958	923.432814	923.863584

5 rows × 203 columns

To check the bottom five rows:

df.tail()

	LifeZone	AIDBSPP	1	2	3	4	5	6	7	...	191	192	193	194	195	196	197	198	199	200
1	Tropical Moist	Red alder	19.072762	37.493209	55.283649	72.465631	89.059964	105.086747	120.565391	...	556.935638	556.960397	556.984310	557.007405	557.029710	557.051253	557.072058	557.092152	557.111559	557.130302
2	Tropical Premontane Wet, Transition to Basal -...	Red alder	11.807128	23.364268	34.676713	45.749643	56.588130	67.197138	77.581525	...	548.300479	548.498635	548.692597	548.882451	549.068286	549.250187	549.428236	549.602515	549.773104	549.940081
3	Tropical Premontane Wet, Transition to Basal -...	Red alder	18.881966	37.176591	54.902144	72.076328	88.716292	104.838655	120.459516	...	605.569245	605.614412	605.658173	605.700574	605.741656	605.781459	605.820025	605.857391	605.893595	605.928673
4	Tropical Wet	Red alder	17.447870	34.574209	51.384943	67.885887	84.082750	99.981136	115.586545	...	919.682397	920.182279	920.672950	921.154578	921.627331	922.091372	922.546862	922.993958	923.432814	923.863584
5	Premontane Rain	Red alder	18.725175	37.075897	55.059652	72.683780	89.955473	106.881777	123.469600	...	916.619721	917.014933	917.402243	917.781807	918.153781	918.518317	918.875563	919.225665	919.568765	919.905005

5 rows × 203 columns
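Because the dataframe above has only six rows, the outputs of head() and tail() overlap in rows 1 to 4. Both methods also take an optional row count. The sketch below uses a hypothetical ten-row dataframe to show this.

```python
import pandas as pd

# Hypothetical dataframe with ten rows (index 0 to 9).
df = pd.DataFrame({"step": range(1, 11), "value": [v * 1.5 for v in range(1, 11)]})

first_two = df.head(2)   # rows with index 0 and 1
last_three = df.tail(3)  # rows with index 7, 8 and 9

print(first_two)
print(last_three)
```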

To check the type of the object, pass the dataframe to the built-in type() function:

print(type(df))

<class 'pandas.core.frame.DataFrame'>
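In a script, the same check is usually written with isinstance(). The following sketch uses a small hypothetical dataframe and also shows that a single column is a Series rather than a DataFrame.

```python
import pandas as pd

# Hypothetical two-row dataframe.
df = pd.DataFrame({"a": [1, 2], "b": [3.0, 4.0]})

is_frame = isinstance(df, pd.DataFrame)     # the whole table
is_series = isinstance(df["a"], pd.Series)  # one column selected by name

print(is_frame, is_series)  # True True
```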

To check the shape of your dataframe, use its shape attribute:

df.shape

(6, 203)

There are 6 rows and 203 columns in total.
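Since shape is a plain tuple of (rows, columns), it can be unpacked into two variables. A minimal sketch with a hypothetical three-row dataframe:

```python
import pandas as pd

# Hypothetical dataframe with three rows and two columns.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

n_rows, n_cols = df.shape
print(f"{n_rows} rows, {n_cols} columns")  # 3 rows, 2 columns

# len() on a dataframe gives the number of rows only.
print(len(df))  # 3
```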

To check column names:

df.columns

Index(['LifeZone', 'AIDBSPP', '0', '1', '2', '3', '4', '5', '6', '7',
       ...
       '191', '192', '193', '194', '195', '196', '197', '198', '199', '200'],
      dtype='object', length=203)
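The column labels shown above are strings such as '0' and '1', not integers, because they were read from the header line of the file. The sketch below, using a small made-up CSV text, shows how to get the labels as a list and how to select one column by its string label.

```python
import io

import pandas as pd

# Hypothetical data with numeric-looking column headers.
csv_text = "LifeZone,0,1,2\nZone A,0,1.5,3.0\nZone B,0,2.5,5.0\n"
df = pd.read_csv(io.StringIO(csv_text))

column_names = df.columns.tolist()  # plain Python list of header strings
col_one = df["1"]                   # the label is the string "1", not the integer 1

print(column_names)  # ['LifeZone', '0', '1', '2']
print(col_one.tolist())  # [1.5, 2.5]
```

Using `df[1]` here would raise a KeyError, since no column has the integer label 1.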

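To get more information about the data, use the info() method. It prints its summary directly and returns None, which is why wrapping it in print() adds a final None line: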

print(df.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6 entries, 0 to 5
Columns: 203 entries, LifeZone to 200
dtypes: float64(200), int64(1), object(2)
memory usage: 9.6+ KB
None
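If you want the summary as text instead of printed output, info() accepts a buffer through its buf parameter. The sketch below uses a small hypothetical dataframe and also prints dtypes, which lists the type of each column.

```python
import io

import pandas as pd

# Hypothetical dataframe with one text, one integer and one float column.
df = pd.DataFrame({"zone": ["A", "B"], "count": [1, 2], "value": [0.5, 1.5]})

result = df.info()     # prints the summary to standard output
print(result is None)  # True: info() has no return value

# Capture the same summary as a string by writing it to a buffer.
buffer = io.StringIO()
df.info(buf=buffer)
summary = buffer.getvalue()

# dtypes is a Series mapping each column name to its data type.
print(df.dtypes)
```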