Saving and sharing your NumPy arrays
What you'll learn
You'll save your NumPy arrays as zipped files and as human-readable, comma-delimited files, i.e. *.csv. You will also learn to load both of these file types back into NumPy workspaces.
What you'll do
You'll learn two ways of saving and reading files, as zipped binary files and as text files, that will serve most of your storage needs in NumPy.
 You'll create two 1D arrays and one 2D array
 You'll save these arrays to files
 You'll remove variables from your workspace
 You'll load the variables from your saved file
 You'll compare zipped binary files to human-readable delimited files
 You'll finish with the skills of saving, loading, and sharing NumPy arrays
What you'll need
 NumPy
 read-write access to your working directory
Load the necessary functions using the following command.
import numpy as np
In this tutorial, you will use the following Python, IPython magic, and NumPy functions:
 np.arange
 np.savez
 del
 whos
 np.load
 np.block
 np.newaxis
 np.savetxt
 np.loadtxt
Create your arrays
Now that you have imported the NumPy library, you can make a couple of
arrays; let's start with two 1D arrays, x and y, where y = x**2. You
will assign x to the integers from 0 to 9 using np.arange.
x = np.arange(0, 10, 1)
y = x ** 2
print(x)
print(y)
Save your arrays with NumPy's savez
Now you have two arrays in your workspace,
x: [0 1 2 3 4 5 6 7 8 9]
y: [ 0 1 4 9 16 25 36 49 64 81]
The first thing you will do is save them to a file as zipped arrays
using savez.
You will use two options to label the arrays in the file:
 x_axis = x: this option assigns the name x_axis to the variable x
 y_axis = y: this option assigns the name y_axis to the variable y
np.savez("x_ysquared.npz", x_axis=x, y_axis=y)
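The effect of the labels can be checked with a small sketch. It assumes a temporary directory and a hypothetical file name, labels_demo.npz, so that it does not touch your working directory. It saves the same arrays once with keyword labels and once without, then lists the names stored in each file.

```python
import os
import tempfile

import numpy as np

x = np.arange(0, 10, 1)
y = x ** 2

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, used only for this illustration.
    path = os.path.join(tmp_dir, "labels_demo.npz")

    # Keyword arguments become the names stored in the file.
    np.savez(path, x_axis=x, y_axis=y)
    with np.load(path) as data:
        keyword_names = sorted(data.files)

    # Arrays passed without keywords receive automatic names.
    np.savez(path, x, y)
    with np.load(path) as data:
        positional_names = sorted(data.files)

print(keyword_names)
print(positional_names)
```

Labeling the arrays yourself makes the file easier to understand later, since automatic names such as arr_0 say nothing about the contents.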
Remove the saved arrays and load them back with NumPy's load
In your current working directory, you should have a new file with the
name x_ysquared.npz. This file is a zipped binary of the two arrays,
x and y. Let's clear the workspace and load the values back in. This
x_ysquared.npz file contains two NPY format files. The NPY format is a
native binary format. You cannot read the numbers in a standard text
editor or spreadsheet.
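As a rough check of that description, the sketch below opens an .npz file with Python's standard zipfile module and lists its members. The temporary directory is an assumption made to keep the example self-contained; the file and array names follow the tutorial.

```python
import os
import tempfile
import zipfile

import numpy as np

x = np.arange(0, 10, 1)
y = x ** 2

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "x_ysquared.npz")
    np.savez(path, x_axis=x, y_axis=y)

    # An .npz file is a zip archive, so zipfile can list what it holds.
    with zipfile.ZipFile(path) as archive:
        member_names = sorted(archive.namelist())

print(member_names)
```

Each saved array appears as one member whose name is the label you chose plus the .npy extension.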
 remove x and y from the workspace with del
 load the arrays into the workspace in a dictionary with np.load
To see what variables are in the workspace, use the Jupyter/IPython
"magic" command whos.
del x, y
%whos
load_xy = np.load("x_ysquared.npz")
print(load_xy.files)
%whos
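Outside of Jupyter or IPython, the %whos magic is not available. One possible substitute, sketched here with a temporary directory as an assumption, is to loop over the names in the loaded file and collect the shape of each array.

```python
import os
import tempfile

import numpy as np

x = np.arange(0, 10, 1)
y = x ** 2

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "x_ysquared.npz")
    np.savez(path, x_axis=x, y_axis=y)

    # Using np.load in a with block closes the file when the block ends.
    with np.load(path) as load_xy:
        # Index the loaded file by name, as you would a dictionary.
        shapes = {name: load_xy[name].shape for name in load_xy.files}

print(shapes)
```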
Reassign the NpzFile arrays to x and y
You've now created a dictionary-like object of type NpzFile. The
included files are x_axis and y_axis, the names that you defined in your
savez command. You can reassign x and y to the load_xy files.
x = load_xy["x_axis"]
y = load_xy["y_axis"]
print(x)
print(y)
Success
You have created, saved, deleted, and loaded the variables x and y
using savez and load. Nice work.
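If you want more than a visual check, a minimal sketch like the following compares the loaded arrays with the originals. It keeps the originals under their own names instead of deleting them, and it writes to a temporary directory; both choices are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np

x = np.arange(0, 10, 1)
y = x ** 2

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "x_ysquared.npz")
    np.savez(path, x_axis=x, y_axis=y)

    with np.load(path) as load_xy:
        # Indexing reads each array into memory, so the arrays
        # remain usable after the file is closed.
        x_loaded = load_xy["x_axis"]
        y_loaded = load_xy["y_axis"]

# Compare values and data types with the original arrays.
values_match = np.array_equal(x, x_loaded) and np.array_equal(y, y_loaded)
dtypes_match = x.dtype == x_loaded.dtype and y.dtype == y_loaded.dtype

print(values_match, dtypes_match)
```

Both checks pass because the NPY format stores the data type together with the values.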
Another option: saving to human-readable csv
Let's consider another scenario: you want to share x and y with
other people or other programs. You may need a human-readable text file
that is easier to share. Next, you use savetxt to save x and y
in a comma-separated value file, x_ysquared.csv.
The resulting csv is composed of ASCII characters. You can load the file
back into NumPy or read it with other programs.
Rearrange the data into a single 2D array
First, you have to create a single 2D array from your two 1D arrays. The
csv file type is a spreadsheet-style dataset. The csv arranges numbers in
rows, separated by new lines, and columns, separated by commas. If the
data is more complex, e.g. multiple 2D arrays or higher dimensional
arrays, it is better to use savez. Here, you use
two NumPy features to format the data:

 np.block: this function appends arrays together into a 2D array
 np.newaxis: this index adds a new axis, turning the 1D array into a 2D column vector with 10 rows and 1 column
array_out = np.block([x[:, np.newaxis], y[:, np.newaxis]])
print("the output array has shape ", array_out.shape, " with values:")
print(array_out)
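As an aside, np.column_stack offers another way to build the same 2D array. The sketch below is only an alternative for comparison, not the approach this tutorial uses; it checks that both constructions give the same result.

```python
import numpy as np

x = np.arange(0, 10, 1)
y = x ** 2

# The tutorial's approach: make each 1D array a column, then join them.
array_block = np.block([x[:, np.newaxis], y[:, np.newaxis]])

# Alternative: column_stack treats each 1D array as one column.
array_stacked = np.column_stack((x, y))

print(array_stacked.shape)
print(np.array_equal(array_block, array_stacked))
```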
Save the data to csv file using savetxt
You use savetxt with three options to make your file easier to read:
 X = array_out: this option tells savetxt to save your 2D array, array_out, to the file x_ysquared.csv
 header = 'x, y': this option writes a header before any data that labels the columns of the csv
 delimiter = ',': this option tells savetxt to place a comma between each column in the file
np.savetxt("x_ysquared.csv", X=array_out, header="x, y", delimiter=",")
Open the file, x_ysquared.csv, and you'll see the following:
# x, y
0.000000000000000000e+00,0.000000000000000000e+00
1.000000000000000000e+00,1.000000000000000000e+00
2.000000000000000000e+00,4.000000000000000000e+00
3.000000000000000000e+00,9.000000000000000000e+00
4.000000000000000000e+00,1.600000000000000000e+01
5.000000000000000000e+00,2.500000000000000000e+01
6.000000000000000000e+00,3.600000000000000000e+01
7.000000000000000000e+00,4.900000000000000000e+01
8.000000000000000000e+00,6.400000000000000000e+01
9.000000000000000000e+00,8.100000000000000000e+01
Our arrays as a csv file
There are two features that you should notice here:
 savetxt starts the header line with # so that loadtxt treats it as a comment and ignores it. If you're using loadtxt with other csv files, you can skip header rows with skiprows = <number_of_header_lines>.
 The integers were written in scientific notation. You can specify the format of the text using the savetxt option, fmt =, but it will still be written with ASCII characters. In general, a text file does not record whether the numbers were float or int.
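Both points can be illustrated with a short sketch. It assumes a temporary directory and a hypothetical file name, x_ysquared_int.csv. It writes the numbers as plain integers with fmt="%d", writes the header without the leading # by setting comments="", and then skips that header line with skiprows=1 when loading.

```python
import os
import tempfile

import numpy as np

x = np.arange(0, 10, 1)
y = x ** 2
array_out = np.block([x[:, np.newaxis], y[:, np.newaxis]])

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, used only for this illustration.
    path = os.path.join(tmp_dir, "x_ysquared_int.csv")

    # fmt="%d" writes plain integers; comments="" drops the "# " prefix.
    np.savetxt(path, X=array_out, header="x, y", delimiter=",",
               fmt="%d", comments="")

    with open(path) as csv_file:
        first_lines = csv_file.read().splitlines()[:3]

    # The header is no longer marked as a comment, so skip it explicitly.
    loaded = np.loadtxt(path, delimiter=",", skiprows=1)

print(first_lines)
print(loaded.shape, loaded.dtype)
```

Even though the file now contains only integer text, loadtxt still returns floating point values by default.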
Now, delete x and y again and assign them to your columns in x_ysquared.csv.
del x, y
load_xy = np.loadtxt("x_ysquared.csv", delimiter=",")
load_xy.shape
x = load_xy[:, 0]
y = load_xy[:, 1]
print(x)
print(y)
Success, but remember your types
When you saved the arrays to the csv file, you did not preserve the
int type. When loading the arrays back into your workspace, the default process will be to load the csv file as a 2D floating point array, e.g. load_xy.dtype == 'float64' and load_xy.shape == (10, 2).
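If you need integers again, one option is to convert the loaded columns yourself. The sketch below assumes the values are whole numbers, as they are here; astype drops any fractional part, so it is not a safe conversion for general floating point data. The temporary directory is also an assumption made to keep the example self-contained.

```python
import os
import tempfile

import numpy as np

x = np.arange(0, 10, 1)
y = x ** 2
array_out = np.block([x[:, np.newaxis], y[:, np.newaxis]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "x_ysquared.csv")
    np.savetxt(path, X=array_out, header="x, y", delimiter=",")
    load_xy = np.loadtxt(path, delimiter=",")

# loadtxt returned floats; convert each column back to integers.
x_int = load_xy[:, 0].astype(np.int64)
y_int = load_xy[:, 1].astype(np.int64)

print(load_xy.dtype)
print(x_int)
print(y_int)
```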
Wrapping up
In conclusion, you can create, save, and load arrays in NumPy. Saving arrays makes sharing your work and collaboration much easier. There are other ways Python can save data to files, such as pickle, but savez and savetxt will serve most of your storage needs for future NumPy work and sharing with other people, respectively.
Next steps: you can import data with missing values from Importing with genfromtxt or learn more about general NumPy IO with Reading and Writing Files.