Keras: The python Deep Learning Library
Keras简介
Keras是一个高层神经网络API,由纯Python编写而成并基Tensorflow、Theano及CNTK后端;Keras为支持快速实验而生,具有以下特点:
- 简易而快速的原型设计,具有高度模块化,极简,和可扩充性
- 支持CNN和RNN,以及二者的结合
- 无缝CPU和GPU切换
Keras适用的Python版本为:Python 2.7-3.6
Keras的核心数据结构是“模型”,它是一种组织网络层的方式;Keras主要的模型有Sequential,Functional;Sequential是一系列网络层按顺序构成的栈,是单输入单输出的,所以编译速度快,操作比较简单;Functional为函数式模型,为多输入多输出模型。
优化方法(optimizers)是编译Keras模型必要的两个参数之一,可以在调用model.compile()之前初始化优化器对象:
sgd = optimizers.SGD(lr=0.01,decay=1e-6,momentum=0.9,nesterov=True)
model.compile(loss='mean_Squared_error',optimizer=sgd)
也可以在调用model.compile()时传递一个预定义优化器名,此时优化器的参数使用默认值:
# pass optimizer by name: default parameters will be uesed
model.compile(loss='mean_squared_error',optimizer='sgd')
参数clipnorm和clipvalue是所有优化器都可以使用的参数,用于对梯度进行裁剪。 SGD,随机梯度下降法,支持运量参数,支持学习衰减率,支持Nesterov运量;参数如下:
- lr: 大于等于0的浮点数,学习率
- momentum: 大于等于0的浮点数,运量参数
- decay: 大于等于0的浮点数,每次更新后的学习率衰减值
- nesterov: 布尔值,确定是否使用Nesterov运量
自带的有以下几种:
1) SGD,随机梯度下降法,用法例如:
keras.optimizers.SGD(lr=0.01,momentum=0.0,decay=0.0,nesterov=False)
2) RMSprop,此优化器是对递归神经网络的一个较好选择,除学习率可调整外,其它参数建议保持默认:
keras.optimizers.RMSprop(lr=0.001,rho=0.9,epsilon=1e-06)
参数rho大于等于0的浮点数,epsilon大于等于0的小浮点数,防止除0错误
3)Adagrad,建议保持默认参数:
keras.optimizers.Adagrad(lr=0.01,epsilon=1e-06)
4) Adadelta,建议保持默认参数:
keras.optimizer.Adadelta(lr=1.0,rho=0.95,epsilon=1e-6)
5) Adam,优化器的默认值来源于参考文献[4]:
keras.optimizers.Adam(lr=0.001,beta_1=0.9,beta_2=0.999,epsilon=1e-08)
参数 beta_1/beta_2: 浮点数,0<beta<1,通常很接近1
6)TFOptimizer,TF(Tensorflow)优化器的包装器:
keras.optimizers.TFOpitimizer(optimizer)
Sequential模型编程
Sequential模型的简单Demo如下:
#Getting started with the Keras Sequential model
#example one: MLP for binary classification
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout
# Generate dummy data
x_train = np.random.random((1000, 20))
y_train = np.random.randint(2, size=(1000, 1))
x_test = np.random.random((100, 20))
y_test = np.random.randint(2, size=(100, 1))
# Stacking layer is as easy as .add()
model = Sequential()
model.add(Dense(64, input_dim=20, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
# Configure its learning process with .compile()
model.compile(loss='binary_crossentropy',optimizer='rmsprop',metrics=['accuracy'])
model.fit(x_train, y_train,epochs=20,batch_size=128)
score = model.evaluate(x_test, y_test, batch_size=128)
运行结果如下:
简单回归模型:
#-*-coding:utf-8-*-
# example two: Regressor
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.models import Sequential
from keras.layers import Dense
import matplotlib.pyplot as plt
# create some data
X = np.linspace(-1, 1, 200)
np.random.shuffle(X) # randomize the data
Y = 0.5 * X + 2 + np.random.normal(0, 0.05, (200, ))
# plot data
plt.scatter(X, Y)
plt.show()
X_train, Y_train = X[:160], Y[:160] # first 160 data points
X_test, Y_test = X[160:], Y[160:] # last 40 data points
# build a neural network from the 1st layer to the last layer
model = Sequential()
model.add(Dense(units=1, input_dim=1))
# choose loss function and optimizing method
model.compile(loss='mse', optimizer='sgd')
# training
print('Training -----------')
for step in range(301):
cost = model.train_on_batch(X_train, Y_train)
if step % 100 == 0:
print('train cost: ', cost)
# test
print('\nTesting ------------')
cost = model.evaluate(X_test, Y_test, batch_size=40)
print('test cost:', cost)
W, b = model.layers[0].get_weights()
print('Weights=', W, '\nbiases=', b)
# plotting the prediction
Y_pred = model.predict(X_test)
plt.scatter(X_test, Y_test)
plt.plot(X_test, Y_pred)
plt.show()
运行结果:
【Ref】
[1] Keras Documentation
[2] Keras中文文档
[3] MorvanZhou tutorials
[4] Adam - A Method for Stochastic Optimization