Tuesday, December 12, 2017

Pandas Library Lesson - 4

Lesson 4



These tutorials are also available through an email course. Please visit http://www.hedaro.com/pandas-tutorial to sign up today.
In this lesson we're going to go back to the basics. We will be working with a small data set so that you can easily follow what I am trying to explain. We will be adding columns, deleting columns, and slicing the data in many different ways. Enjoy!
In [1]:
# Import libraries
import pandas as pd
import sys
In [2]:
print('Python version ' + sys.version)
print('Pandas version: ' + pd.__version__)
Python version 3.5.1 |Anaconda custom (64-bit)| (default, Feb 16 2016, 09:49:46) [MSC v.1900 64 bit (AMD64)]
Pandas version: 0.20.1
In [3]:
# Our small data set
d = [0,1,2,3,4,5,6,7,8,9]

# Create dataframe
df = pd.DataFrame(d)
df
Out[3]:
   0
0  0
1  1
2  2
3  3
4  4
5  5
6  6
7  7
8  8
9  9
In [4]:
# Let's change the name of the column
df.columns = ['Rev']
df
Out[4]:
   Rev
0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
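Assigning to df.columns replaces every column name at once. As an alternative, here is a minimal sketch using DataFrame.rename, which maps only the names you list and returns a new dataframe. The variable name renamed is ours and is not part of the lesson.

```python
import pandas as pd

# Same small data set as in the lesson
df = pd.DataFrame([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

# rename maps old names to new ones and returns a new dataframe;
# the original df keeps its default column name 0
renamed = df.rename(columns={0: 'Rev'})

print(list(df.columns))
print(list(renamed.columns))
```

This form can be handy when a dataframe has many columns and only one or two need new names.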
In [5]:
# Let's add a column
df['NewCol'] = 5
df
Out[5]:
   Rev  NewCol
0    0       5
1    1       5
2    2       5
3    3       5
4    4       5
5    5       5
6    6       5
7    7       5
8    8       5
9    9       5
In [6]:
# Let's modify our new column
df['NewCol'] = df['NewCol'] + 1
df
Out[6]:
   Rev  NewCol
0    0       6
1    1       6
2    2       6
3    3       6
4    4       6
5    5       6
6    6       6
7    7       6
8    8       6
9    9       6
In [7]:
# We can delete columns
del df['NewCol']
df
Out[7]:
   Rev
0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
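The del statement removes the column from df in place. If you would rather keep the original dataframe untouched, one possible approach is DataFrame.drop, sketched below. The name trimmed is an illustrative variable and does not appear in the lesson.

```python
import pandas as pd

# Rebuild the lesson's dataframe with the extra column
df = pd.DataFrame({'Rev': list(range(10))})
df['NewCol'] = 5

# drop with axis=1 removes a column and returns a new dataframe;
# df itself still has both columns afterwards
trimmed = df.drop('NewCol', axis=1)

print(list(trimmed.columns))
print(list(df.columns))
```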
In [8]:
# Let's add a couple of columns
df['test'] = 3
df['col'] = df['Rev']
df
Out[8]:
   Rev  test  col
0    0     3    0
1    1     3    1
2    2     3    2
3    3     3    3
4    4     3    4
5    5     3    5
6    6     3    6
7    7     3    7
8    8     3    8
9    9     3    9
In [9]:
# If we wanted, we could change the labels of the index
i = ['a','b','c','d','e','f','g','h','i','j']
df.index = i
df
Out[9]:
   Rev  test  col
a    0     3    0
b    1     3    1
c    2     3    2
d    3     3    3
e    4     3    4
f    5     3    5
g    6     3    6
h    7     3    7
i    8     3    8
j    9     3    9
We can now start to select pieces of the dataframe using loc.
In [10]:
df.loc['a']
Out[10]:
Rev     0
test    3
col     0
Name: a, dtype: int64
In [11]:
# df.loc[inclusive:inclusive]
df.loc['a':'d']
Out[11]:
   Rev  test  col
a    0     3    0
b    1     3    1
c    2     3    2
d    3     3    3
In [12]:
# df.iloc[inclusive:exclusive]
# Note: .iloc is strictly integer position based. It is available from version 0.11.0
# (http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#v0-11-0-april-22-2013)
df.iloc[0:3]
Out[12]:
   Rev  test  col
a    0     3    0
b    1     3    1
c    2     3    2
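The difference between the two slices is easy to miss: a label slice with loc includes its end label, while a position slice with iloc stops before its end position. A small sketch that puts both side by side (the names by_label and by_position are ours):

```python
import pandas as pd

# Single-column version of the lesson's dataframe, indexed a..j
df = pd.DataFrame({'Rev': list(range(10))}, index=list('abcdefghij'))

by_label = df.loc['a':'d']    # label slice: the end label 'd' is included
by_position = df.iloc[0:3]    # position slice: position 3 is excluded

print(list(by_label.index))
print(list(by_position.index))
```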
We can also select using the column name.
In [13]:
df['Rev']
Out[13]:
a    0
b    1
c    2
d    3
e    4
f    5
g    6
h    7
i    8
j    9
Name: Rev, dtype: int64
In [14]:
df[['Rev', 'test']]
Out[14]:
   Rev  test
a    0     3
b    1     3
c    2     3
d    3     3
e    4     3
f    5     3
g    6     3
h    7     3
i    8     3
j    9     3
In [15]:
# df.loc[rows, columns]
# .ix is deprecated; the .loc call below replaces:
# df.ix[0:3,'Rev']
df.loc[df.index[0:3],'Rev']
Out[15]:
a    0
b    1
c    2
Name: Rev, dtype: int64
In [16]:
# .ix is deprecated; the .loc call below replaces:
# df.ix[5:,'col']
df.loc[df.index[5:],'col']
Out[16]:
f    5
g    6
h    7
i    8
j    9
Name: col, dtype: int64
In [17]:
# .ix is deprecated; the .loc call below replaces:
# df.ix[:3,['col', 'test']]
df.loc[df.index[:3],['col', 'test']]
Out[17]:
   col  test
a    0     3
b    1     3
c    2     3
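The cells above mix row positions with column labels by converting the positions into labels (df.index[0:3]) and then calling loc. The conversion can also go the other way: look up the column's position and use iloc. The sketch below compares both routes. The names via_loc and via_iloc are illustrative, and it assumes the column labels are unique so that get_loc returns a single integer.

```python
import pandas as pd

# Rebuild the lesson's dataframe
df = pd.DataFrame({'Rev': list(range(10)), 'test': 3}, index=list('abcdefghij'))
df['col'] = df['Rev']

# Route used in the lesson: positions -> labels, then loc
via_loc = df.loc[df.index[0:3], 'Rev']

# Alternative: column label -> position, then iloc
via_iloc = df.iloc[0:3, df.columns.get_loc('Rev')]

# Both selections hold the same three rows of the Rev column
print(via_loc.equals(via_iloc))
```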
There are also some handy functions to select the top and bottom records of a dataframe.
In [18]:
# Select top N number of records (default = 5)
df.head()
Out[18]:
   Rev  test  col
a    0     3    0
b    1     3    1
c    2     3    2
d    3     3    3
e    4     3    4
In [19]:
# Select bottom N number of records (default = 5)
df.tail()
Out[19]:
   Rev  test  col
f    5     3    5
g    6     3    6
h    7     3    7
i    8     3    8
j    9     3    9
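Both functions accept the number of records as an argument, so you are not limited to the default of 5. A short sketch, with top3 and bottom2 as illustrative names:

```python
import pandas as pd

# Single-column version of the lesson's dataframe, indexed a..j
df = pd.DataFrame({'Rev': list(range(10))}, index=list('abcdefghij'))

top3 = df.head(3)      # first three records
bottom2 = df.tail(2)   # last two records

print(list(top3.index))
print(list(bottom2.index))
```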
This tutorial was created by HEDARO
