Datatype: CSV
CSV stands for comma-separated values. A CSV file is a plain text file that stores tabular data: each line is one record, and the fields within a record are separated by commas. These files are often used to exchange data between different applications.
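To make the format concrete, here is a minimal sketch that parses a few lines of CSV text with Python's built-in csv module. The sample text and its column names are made up for illustration and are not taken from the Growth Curves file. It also shows that a field containing a comma has to be wrapped in double quotes.

```python
import csv
import io

# Hypothetical sample text; not taken from Growth_Curves.csv.
sample_text = (
    "name,region,value\n"
    '"Wet, Transition",north,1.5\n'
    "Dry,south,2.0\n"
)

# csv.reader splits each line on commas but keeps quoted fields intact.
rows = list(csv.reader(io.StringIO(sample_text)))
header, records = rows[0], rows[1:]

print(header)
for record in records:
    print(record)
```

Every value comes back as a string here; converting the fields to numbers is one of the things pandas handles for you.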
To open a CSV file in Python, you can use the pandas library. Pandas is an open source library used in data analysis.
In this example, we are using the Growth Curves file.
Go through the following steps to open a CSV file:
Install the pandas library with the pip command
pip install pandas
conda install -c conda-forge pandas
pip install --upgrade pandas
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done
# All requested packages already installed.
Note: you may need to restart the kernel to use updated packages.
Import the pandas library as pd
import pandas as pd
Use the read_csv function (provided by the pandas library), and pass the path to your CSV file as a parameter
Use the print function to view the output
df = pd.read_csv(r'Growth_Curves.csv')
print(df)
LifeZone AIDBSPP 0 1 \
0 Tropical Dry Red alder 0 0.003347
1 Tropical Moist Red alder 0 19.072762
2 Tropical Premontane Wet, Transition to Basal -... Red alder 0 11.807128
3 Tropical Premontane Wet, Transition to Basal -... Red alder 0 18.881966
4 Tropical Wet Red alder 0 17.447870
5 Premontane Rain Red alder 0 18.725175
2 3 4 5 6 7 ... \
0 0.089069 0.543011 1.816663 4.385234 8.627880 14.765182 ...
1 37.493209 55.283649 72.465631 89.059964 105.086747 120.565391 ...
2 23.364268 34.676713 45.749643 56.588130 67.197138 77.581525 ...
3 37.176591 54.902144 72.076328 88.716292 104.838655 120.459516 ...
4 34.574209 51.384943 67.885887 84.082750 99.981136 115.586545 ...
5 37.075897 55.059652 72.683780 89.955473 106.881777 123.469600 ...
191 192 193 194 195 196 \
0 329.510638 329.510638 329.510638 329.510638 329.510638 329.510638
1 556.935638 556.960397 556.984310 557.007405 557.029710 557.051253
2 548.300479 548.498635 548.692597 548.882451 549.068286 549.250187
3 605.569245 605.614412 605.658173 605.700574 605.741656 605.781459
4 919.682397 920.182279 920.672950 921.154578 921.627331 922.091372
5 916.619721 917.014933 917.402243 917.781807 918.153781 918.518317
197 198 199 200
0 329.510638 329.510638 329.510638 329.510638
1 557.072058 557.092152 557.111559 557.130302
2 549.428236 549.602515 549.773104 549.940081
3 605.820025 605.857391 605.893595 605.928673
4 922.546862 922.993958 923.432814 923.863584
5 918.875563 919.225665 919.568765 919.905005
[6 rows x 203 columns]
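If you do not have the Growth Curves file at hand, the same call can be tried on in-memory text. This is a minimal sketch: the layout (two text columns followed by numbered columns) imitates the file above, but the values and the names csv_text and sample_df are hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-in for Growth_Curves.csv: similar layout, made-up values.
csv_text = (
    "LifeZone,AIDBSPP,0,1,2\n"
    "Zone A,Species X,0,1.5,3.0\n"
    "Zone B,Species X,0,2.5,5.0\n"
)

# read_csv accepts a file path or any file-like object such as StringIO.
sample_df = pd.read_csv(io.StringIO(csv_text))
print(sample_df)
```

The first line of the text becomes the column labels, and each remaining line becomes one row.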
To check the top five rows:
df.head()
| | LifeZone | AIDBSPP | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | ... | 191 | 192 | 193 | 194 | 195 | 196 | 197 | 198 | 199 | 200 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Tropical Dry | Red alder | 0 | 0.003347 | 0.089069 | 0.543011 | 1.816663 | 4.385234 | 8.627880 | 14.765182 | ... | 329.510638 | 329.510638 | 329.510638 | 329.510638 | 329.510638 | 329.510638 | 329.510638 | 329.510638 | 329.510638 | 329.510638 |
1 | Tropical Moist | Red alder | 0 | 19.072762 | 37.493209 | 55.283649 | 72.465631 | 89.059964 | 105.086747 | 120.565391 | ... | 556.935638 | 556.960397 | 556.984310 | 557.007405 | 557.029710 | 557.051253 | 557.072058 | 557.092152 | 557.111559 | 557.130302 |
2 | Tropical Premontane Wet, Transition to Basal -... | Red alder | 0 | 11.807128 | 23.364268 | 34.676713 | 45.749643 | 56.588130 | 67.197138 | 77.581525 | ... | 548.300479 | 548.498635 | 548.692597 | 548.882451 | 549.068286 | 549.250187 | 549.428236 | 549.602515 | 549.773104 | 549.940081 |
3 | Tropical Premontane Wet, Transition to Basal -... | Red alder | 0 | 18.881966 | 37.176591 | 54.902144 | 72.076328 | 88.716292 | 104.838655 | 120.459516 | ... | 605.569245 | 605.614412 | 605.658173 | 605.700574 | 605.741656 | 605.781459 | 605.820025 | 605.857391 | 605.893595 | 605.928673 |
4 | Tropical Wet | Red alder | 0 | 17.447870 | 34.574209 | 51.384943 | 67.885887 | 84.082750 | 99.981136 | 115.586545 | ... | 919.682397 | 920.182279 | 920.672950 | 921.154578 | 921.627331 | 922.091372 | 922.546862 | 922.993958 | 923.432814 | 923.863584 |
5 rows × 203 columns
To check the bottom five rows:
df.tail()
| | LifeZone | AIDBSPP | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | ... | 191 | 192 | 193 | 194 | 195 | 196 | 197 | 198 | 199 | 200 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Tropical Moist | Red alder | 0 | 19.072762 | 37.493209 | 55.283649 | 72.465631 | 89.059964 | 105.086747 | 120.565391 | ... | 556.935638 | 556.960397 | 556.984310 | 557.007405 | 557.029710 | 557.051253 | 557.072058 | 557.092152 | 557.111559 | 557.130302 |
2 | Tropical Premontane Wet, Transition to Basal -... | Red alder | 0 | 11.807128 | 23.364268 | 34.676713 | 45.749643 | 56.588130 | 67.197138 | 77.581525 | ... | 548.300479 | 548.498635 | 548.692597 | 548.882451 | 549.068286 | 549.250187 | 549.428236 | 549.602515 | 549.773104 | 549.940081 |
3 | Tropical Premontane Wet, Transition to Basal -... | Red alder | 0 | 18.881966 | 37.176591 | 54.902144 | 72.076328 | 88.716292 | 104.838655 | 120.459516 | ... | 605.569245 | 605.614412 | 605.658173 | 605.700574 | 605.741656 | 605.781459 | 605.820025 | 605.857391 | 605.893595 | 605.928673 |
4 | Tropical Wet | Red alder | 0 | 17.447870 | 34.574209 | 51.384943 | 67.885887 | 84.082750 | 99.981136 | 115.586545 | ... | 919.682397 | 920.182279 | 920.672950 | 921.154578 | 921.627331 | 922.091372 | 922.546862 | 922.993958 | 923.432814 | 923.863584 |
5 | Premontane Rain | Red alder | 0 | 18.725175 | 37.075897 | 55.059652 | 72.683780 | 89.955473 | 106.881777 | 123.469600 | ... | 916.619721 | 917.014933 | 917.402243 | 917.781807 | 918.153781 | 918.518317 | 918.875563 | 919.225665 | 919.568765 | 919.905005 |
5 rows × 203 columns
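Both head() and tail() return five rows by default, which is why their outputs overlap on this six-row table. Each also accepts the number of rows to return. A minimal sketch on a small, made-up DataFrame (the name sample_df and its values are hypothetical):

```python
import pandas as pd

# Hypothetical data, used only to illustrate the calls.
sample_df = pd.DataFrame(
    {"zone": ["A", "B", "C", "D"], "value": [1.0, 2.0, 3.0, 4.0]}
)

first_two = sample_df.head(2)  # rows with index 0 and 1
last_two = sample_df.tail(2)   # rows with index 2 and 3

print(first_two)
print(last_two)
```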
To check the type of the object, pass the dataframe to the built-in type() function:
print(type(df))
<class 'pandas.core.frame.DataFrame'>
To check the shape of your dataframe, use the shape attribute:
df.shape
(6, 203)
There are 6 rows and 203 columns in total.
To check column names:
df.columns
Index(['LifeZone', 'AIDBSPP', '0', '1', '2', '3', '4', '5', '6', '7',
...
'191', '192', '193', '194', '195', '196', '197', '198', '199', '200'],
dtype='object', length=203)
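Once you know the column names, you can use them to select data. The sketch below uses a small, made-up DataFrame with the same kind of labels. Note that the numbered columns are labeled with strings such as '1', not with integers. The name sample_df and its values are hypothetical.

```python
import pandas as pd

# Hypothetical frame mirroring the kind of column labels shown above.
sample_df = pd.DataFrame(
    {
        "LifeZone": ["Zone A", "Zone B"],
        "AIDBSPP": ["Species X", "Species X"],
        "0": [0, 0],
        "1": [1.5, 2.5],
    }
)

column_names = sample_df.columns.tolist()  # plain Python list of labels
one_column = sample_df["1"]                # a Series; the label is the string "1"
subset = sample_df[["LifeZone", "1"]]      # a DataFrame with two columns

print(column_names)
print(subset)
```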
To get more information about the data, call the info() method. It prints its summary directly and returns None, so there is no need to wrap it in print():
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6 entries, 0 to 5
Columns: 203 entries, LifeZone to 200
dtypes: float64(200), int64(1), object(2)
memory usage: 9.6+ KB
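The summary above reports how many columns there are of each data type. To see the type of each column, or to keep only the numeric columns, the dtypes attribute and the select_dtypes method can help. A minimal sketch on made-up data (the names sample_df and numeric_part are hypothetical):

```python
import pandas as pd

# Hypothetical frame with one text column and two numeric columns.
sample_df = pd.DataFrame(
    {"LifeZone": ["Zone A", "Zone B"], "0": [0, 0], "1": [1.5, 2.5]}
)

# dtypes lists the data type of every column.
print(sample_df.dtypes)

# select_dtypes keeps only the columns whose type matches.
numeric_part = sample_df.select_dtypes(include="number")
print(numeric_part.columns.tolist())
```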