Tuesday, December 12, 2017

pandas Training Document

  • LESSON 01: Importing libraries - Creating data sets - Creating data frames - Reading from CSV - Exporting to CSV - Finding maximums - Plotting data
  • LESSON 02: Reading from TXT - Exporting to TXT - Selecting top/bottom records - Descriptive statistics - Grouping/sorting data
  • LESSON 03: Creating functions - Reading from EXCEL - Exporting to EXCEL - Outliers - Lambda functions - Slice and dice data
  • LESSON 04: Adding/deleting columns - Index operations
  • LESSON 05: Stack/Unstack/Transpose functions
  • LESSON 06: GroupBy function
  • LESSON 07: Ways to calculate outliers
  • LESSON 08: Read from Microsoft SQL databases
  • LESSON 09: Export to CSV/EXCEL/TXT
  • LESSON 10: Converting between different kinds of formats
  • LESSON 11: Combining data from various sources

Pandas Library Lesson 11

Lesson 11



These tutorials are also available through an email course. Please visit http://www.hedaro.com/pandas-tutorial to sign up today.
Grab data from multiple Excel files and merge them into a single DataFrame.
In [1]:
import pandas as pd
import matplotlib
import os
import sys
%matplotlib inline
In [2]:
print('Python version ' + sys.version)
print('Pandas version ' + pd.__version__)
print('Matplotlib version ' + matplotlib.__version__)
Python version 3.5.1 |Anaconda custom (64-bit)| (default, Feb 16 2016, 09:49:46) [MSC v.1900 64 bit (AMD64)]
Pandas version 0.20.1
Matplotlib version 1.5.1

Create 3 Excel files

In [3]:
# Create DataFrame
d = {'Channel':[1], 'Number':[255]}
df = pd.DataFrame(d)
df
Out[3]:
   Channel  Number
0        1     255
In [4]:
# Export to Excel

df.to_excel('test1.xlsx', sheet_name = 'test1', index = False)
df.to_excel('test2.xlsx', sheet_name = 'test2', index = False)
df.to_excel('test3.xlsx', sheet_name = 'test3', index = False)
print('Done')
Done

Place all three Excel files into a DataFrame

Get a list of the file names, and make sure the folder contains no other Excel files.
In [5]:
# List to hold file names
FileNames = []

# Your path will be different, please modify the path below.
os.chdir(r"C:\Users\david\notebooks\update")

# Find any file that ends with ".xlsx"
for files in os.listdir("."):
    if files.endswith(".xlsx"):
        FileNames.append(files)
        
FileNames
Out[5]:
['test1.xlsx', 'test2.xlsx', 'test3.xlsx']
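`os.listdir` returns names in arbitrary order, and the loop above depends on the current working directory. The sketch below shows one possible alternative using the standard `pathlib` module. The helper name `list_excel_files` and the temporary folder with empty placeholder files are illustrative assumptions, not part of the original lesson.

```python
import tempfile
from pathlib import Path


def list_excel_files(folder):
    """Return the sorted names of the .xlsx files directly inside folder."""
    return sorted(p.name for p in Path(folder).glob("*.xlsx"))


# Demonstrate on a throwaway folder holding empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["test2.xlsx", "notes.txt", "test1.xlsx", "test3.xlsx"]:
        (Path(tmp) / name).touch()
    file_names = list_excel_files(tmp)

print(file_names)
```

Passing the folder as an argument avoids the `os.chdir` call, and sorting makes the order of the combined rows repeatable.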
Create a function to process each of the Excel files.
In [6]:
def GetFile(fnombre):

    # Path to excel file
    # Your path will be different, please modify the path below.
    location = os.path.join(r'C:\Users\david\notebooks\update', fnombre)
    
    # Parse the excel file
    # 0 = first sheet
    df = pd.read_excel(location, 0)
    
    # Tag record to file name
    df['File'] = fnombre
    
    # Make the "File" column the index of the df
    return df.set_index(['File'])
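The function above hard-codes the folder. A variation that takes the folder as a parameter might look like the sketch below. The name `get_file` and the `reader` parameter are assumptions introduced for illustration; `reader` defaults to `pd.read_excel` so the behaviour matches the lesson, but it lets the same function be tried with another pandas reader such as `pd.read_csv`.

```python
import os

import pandas as pd


def get_file(folder, file_name, reader=pd.read_excel):
    """Load one file from folder and index its rows by the file name."""
    # os.path.join inserts the correct separator for the platform
    location = os.path.join(folder, file_name)

    # reader defaults to pd.read_excel, which parses the first sheet
    df = reader(location)

    # Tag each record with the file it came from
    df['File'] = file_name

    # Make the "File" column the index of the df
    return df.set_index('File')
```

Usage would then be along the lines of `df_list = [get_file(folder, name) for name in FileNames]`, where `folder` is the directory that holds the workbooks.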
Go through each file name, create a DataFrame from it, and add it to a list, i.e.
df_list = [df, df, df]
In [7]:
# Create a list of dataframes
df_list = [GetFile(fname) for fname in FileNames]
df_list
Out[7]:
[            Channel  Number
 File                       
 test1.xlsx        1     255,             Channel  Number
 File                       
 test2.xlsx        1     255,             Channel  Number
 File                       
 test3.xlsx        1     255]
In [8]:
# Combine all of the dataframes into one
big_df = pd.concat(df_list)
big_df
Out[8]:
            Channel  Number
File
test1.xlsx        1     255
test2.xlsx        1     255
test3.xlsx        1     255
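Tagging each frame with a "File" column is one way to keep track of where a row came from. Another option is to let `pd.concat` build the label from dictionary keys. The sketch below uses in-memory stand-ins for the three workbooks, and the names `frames` and `labeled` are illustrative only.

```python
import pandas as pd

# Stand-ins for the three workbooks read from disk
frames = {
    name: pd.DataFrame({'Channel': [1], 'Number': [255]})
    for name in ['test1.xlsx', 'test2.xlsx', 'test3.xlsx']
}

# The dict keys become the outer index level, named "File" here
labeled = pd.concat(frames, names=['File', 'Row'])

# Drop the inner level (the original row numbers) to match big_df's layout
labeled = labeled.droplevel('Row')

print(labeled)
```

Keeping the inner "Row" level instead of dropping it would preserve each record's position within its source file.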
In [9]:
big_df.dtypes
Out[9]:
Channel    int64
Number     int64
dtype: object
In [10]:
# Plot it!
big_df['Channel'].plot.bar();
This tutorial was created by HEDARO

Pandas Library Lesson 10

Lesson 10



  • From DataFrame to Excel
  • From Excel to DataFrame
  • From DataFrame to JSON
  • From JSON to DataFrame
In [1]:
import pandas as pd
import sys
In [2]:
print('Python version ' + sys.version)
print('Pandas version ' + pd.__version__)
Python version 3.5.1 |Anaconda custom (64-bit)| (default, Feb 16 2016, 09:49:46) [MSC v.1900 64 bit (AMD64)]
Pandas version 0.20.1

From DataFrame to Excel

In [3]:
# Create DataFrame
d = [1,2,3,4,5,6,7,8,9]
df = pd.DataFrame(d, columns = ['Number'])
df
Out[3]:
   Number
0       1
1       2
2       3
3       4
4       5
5       6
6       7
7       8
8       9
In [4]:
# Export to Excel
df.to_excel('Lesson10.xlsx', sheet_name = 'testing', index = False)
print('Done')
Done

From Excel to DataFrame

In [5]:
# Path to excel file
# Your path will be different, please modify the path below.
location = r'C:\Users\david\notebooks\update\Lesson10.xlsx'

# Parse the excel file
df = pd.read_excel(location, 0)
df.head()
Out[5]:
   Number
0       1
1       2
2       3
3       4
4       5
In [6]:
df.dtypes
Out[6]:
Number    int64
dtype: object
In [7]:
df.tail()
Out[7]:
   Number
4       5
5       6
6       7
7       8
8       9

From DataFrame to JSON

In [8]:
df.to_json('Lesson10.json')
print('Done')
Done

From JSON to DataFrame

In [9]:
# Your path will be different, please modify the path below.
jsonloc = r'C:\Users\david\notebooks\update\Lesson10.json'

# read json file
df2 = pd.read_json(jsonloc)
In [10]:
df2
Out[10]:
   Number
0       1
1       2
2       3
3       4
4       5
5       6
6       7
7       8
8       9
In [11]:
df2.dtypes
Out[11]:
Number    int64
dtype: object
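The JSON round trip above goes through a file, so the layout of the JSON itself is never shown. The sketch below keeps everything in memory to make that layout visible. It uses a shortened three-row frame, and the variable names are illustrative assumptions.

```python
import io
import json

import pandas as pd

df = pd.DataFrame({'Number': [1, 2, 3]})

# Without a path, to_json returns the JSON text instead of writing a file
by_column = df.to_json()                  # default layout: column -> {index -> value}
by_record = df.to_json(orient='records')  # one object per row, index dropped

print(by_column)
print(by_record)

# read_json is given the same orient to rebuild the frame
restored = pd.read_json(io.StringIO(by_record), orient='records')
print(restored['Number'].tolist())
```

The default layout keeps the index labels, while `orient='records'` drops them, so the choice depends on whether the index carries information worth keeping.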
This tutorial was created by HEDARO

File tree for a Node.js project

 find . \( -path "*/node_modules" -o -path "*/.git" \) -prune -o -print | tree -a -I 'node_modules|.git'
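A similar listing can be approximated in Python with `os.walk`, pruning the same two directories. This is only a sketch: the function name `list_tree` and its `skip` parameter are assumptions, and it returns a flat sorted list of relative paths rather than drawing a tree.

```python
import os


def list_tree(root, skip=('node_modules', '.git')):
    """Return sorted paths under root (relative to it), pruning skipped directories."""
    found = []
    for current, dirs, files in os.walk(root):
        # Editing dirs in place stops os.walk from descending into them
        dirs[:] = [d for d in dirs if d not in skip]
        for name in dirs + files:
            full = os.path.join(current, name)
            found.append(os.path.relpath(full, root))
    return sorted(found)
```

Calling `list_tree('.')` from the project root would list every file and folder except those under `node_modules` and `.git`.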