The essence of an LSTM lies not in its name but in how its gates selectively discard outdated information, write in new information, and pass the current state forward to the next time step. When reading about LSTMs, sketching each time step is often more intuitive than studying formulas alone. This article focuses on implementation. Rather than copy-pasting code, verify each component in turn: the execution environment, the input tensor shape, the model invocation, and the interpretation of the output.
I'll verify four key aspects: input dimensionality, sequence length, hidden size, and which time step's output is selected. Pinning down these four points helps avoid the most common implementation pitfalls.
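To make the gating description concrete, here is a minimal NumPy sketch of a single LSTM layer's forward pass. It is an illustration under stated assumptions, not what Keras executes internally: the weights are random, the gate order (input, forget, candidate, output) is a convention chosen here, and the names `lstm_forward`, `W`, `U`, and `b` are hypothetical. It also makes the four quantities above visible: input dimensionality, sequence length, hidden size, and which time step's output is kept.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_forward(x_seq, W, U, b, hidden_size):
    """Run one sequence through a single LSTM layer (illustrative only).

    x_seq: (seq_len, input_dim)
    W:     (input_dim, 4 * hidden_size)   input weights for the four gate blocks
    U:     (hidden_size, 4 * hidden_size) recurrent weights
    b:     (4 * hidden_size,)             biases
    Returns the hidden state at every time step: (seq_len, hidden_size).
    """
    h = np.zeros(hidden_size)  # hidden state passed to the next step
    c = np.zeros(hidden_size)  # cell state (the long-term memory)
    states = []
    for x_t in x_seq:
        z = x_t @ W + h @ U + b
        i, f, g, o = np.split(z, 4)            # assumed gate order: input, forget, candidate, output
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g                      # discard part of the old state, write in new information
        h = o * np.tanh(c)                     # expose the current state to the next time step
        states.append(h)
    return np.stack(states)

rng = np.random.default_rng(0)
seq_len, input_dim, hidden_size = 10, 1, 4     # sequence length, input dimensionality, hidden size
W = rng.normal(scale=0.1, size=(input_dim, 4 * hidden_size))
U = rng.normal(scale=0.1, size=(hidden_size, 4 * hidden_size))
b = np.zeros(4 * hidden_size)
x_seq = rng.normal(size=(seq_len, input_dim))

all_states = lstm_forward(x_seq, W, U, b, hidden_size)
last_state = all_states[-1]                    # output of the final time step only
print(all_states.shape)  # (10, 4)
print(last_state.shape)  # (4,)
```

Keeping only `all_states[-1]` corresponds to using the final time step's output, while keeping `all_states` corresponds to returning the whole sequence.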
In the previous article, we examined the principles behind LSTM (Long Short-Term Memory) networks in depth and saw how their gating mechanisms let the internal units capture long-range dependencies in sequential data. Next, we turn to practical implementation: building a simple LSTM model in Python with TensorFlow/Keras and applying it in a hands-on case study.
Implementing LSTM in Code
Start by walking through a small, self-contained example end to end, then assess which steps you can already carry out on your own.
Once you have finished, summarize the core workflow in a short review table and validate it against a minimal task.
Environment Setup
First, ensure the following libraries are installed:
pip install numpy pandas matplotlib scikit-learn tensorflow
Data Preparation
In this section, we use a synthetic time-series dataset as our case study: the task is to predict the next value in a sequence. We generate the data with NumPy, store it in a pandas DataFrame, and scale it to the range [0, 1] with scikit-learn's MinMaxScaler.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
# Generate synthetic time-series data
data = np.sin(np.arange(0, 100, 0.1)) + np.random.normal(scale=0.5, size=1000)
data = pd.DataFrame(data, columns=['value'])
# Normalize the data
scaler = MinMaxScaler(feature_range=(0, 1))
data['value'] = scaler.fit_transform(data['value'].values.reshape(-1, 1))
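As a rough illustration of what the normalization step does, the sketch below applies min-max scaling by hand and then reverses it. The helper names (`minmax_fit`, `minmax_transform`, `minmax_inverse`) are hypothetical and are not part of scikit-learn. The code assumes the default target range of (0, 1) and a series that is not constant.

```python
import numpy as np

def minmax_fit(values):
    # Remember the minimum and maximum so the scaling can be undone later.
    return float(np.min(values)), float(np.max(values))

def minmax_transform(values, lo, hi):
    # Map [lo, hi] linearly onto [0, 1].
    return (values - lo) / (hi - lo)

def minmax_inverse(scaled, lo, hi):
    # Map [0, 1] back onto the original scale.
    return scaled * (hi - lo) + lo

raw = np.array([-1.0, 0.0, 1.0, 3.0])
lo, hi = minmax_fit(raw)
scaled = minmax_transform(raw, lo, hi)
restored = minmax_inverse(scaled, lo, hi)

print(scaled.tolist())    # [0.0, 0.25, 0.5, 1.0]
print(restored.tolist())  # [-1.0, 0.0, 1.0, 3.0]
```

The inverse mapping is why the fitted scaler has to be kept around: predictions come out on the [0, 1] scale and need to be mapped back before they are reported.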
Dataset Splitting
To train the LSTM, we must reshape the time-series data into an appropriate input format. Specifically, we use the past n_steps observations to predict the next value, which gives input of shape (samples, n_steps, features).
# Define number of time steps
n_steps = 10
def create_dataset(data, n_steps=1):
    X, y = [], []
    for i in range(len(data) - n_steps):
        X.append(data[i:(i + n_steps), 0])
        y.append(data[i + n_steps, 0])
    return np.array(X), np.array(y)
X, y = create_dataset(data.values, n_steps)
X = X.reshape((X.shape[0], X.shape[1], 1)) # Reshape to match LSTM input requirements
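To see what the windowing produces, it can help to run the same idea on a tiny sequence where every value is easy to follow. The sketch below uses a hypothetical helper, `make_windows`, on a one-dimensional toy series; it mirrors the logic above but is not the article's function.

```python
import numpy as np

def make_windows(series, n_steps):
    # Each sample is n_steps consecutive values; the target is the value that follows.
    X, y = [], []
    for i in range(len(series) - n_steps):
        X.append(series[i:i + n_steps])
        y.append(series[i + n_steps])
    return np.array(X), np.array(y)

series = np.arange(6, dtype=float)             # 0, 1, 2, 3, 4, 5
X_toy, y_toy = make_windows(series, n_steps=3)
X_toy = X_toy.reshape((X_toy.shape[0], X_toy.shape[1], 1))  # (samples, time steps, features)

print(X_toy.shape)              # (3, 3, 1)
print(X_toy[:, :, 0].tolist())  # [[0.0, 1.0, 2.0], [1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
print(y_toy.tolist())           # [3.0, 4.0, 5.0]
```

A series of length N yields N - n_steps samples, so the 1000-point series above with n_steps = 10 gives 990 training windows.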
Building the LSTM Model
Below is the Keras code to construct the LSTM model.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
# Build the LSTM model
model = Sequential()
model.add(LSTM(50, activation='relu', return_sequences=True, input_shape=(n_steps, 1)))
model.add(Dropout(0.2))
model.add(LSTM(50, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
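One way to sanity-check the architecture is to work out the expected number of trainable parameters by hand and compare it with what `model.summary()` reports. The sketch below assumes the standard LSTM formulation with a bias term (four gate blocks, each with input weights, recurrent weights, and a bias); the helper names are hypothetical.

```python
def lstm_param_count(input_dim, units):
    # Four gate blocks, each with an input kernel, a recurrent kernel, and a bias vector.
    return 4 * (units * (input_dim + units) + units)

def dense_param_count(input_dim, units):
    # One weight per input-output pair plus one bias per output unit.
    return input_dim * units + units

first_lstm = lstm_param_count(input_dim=1, units=50)    # input: 1 feature per time step
second_lstm = lstm_param_count(input_dim=50, units=50)  # input: 50 values from the first LSTM
dense_head = dense_param_count(input_dim=50, units=1)
total = first_lstm + second_lstm + dense_head           # Dropout layers add no parameters

print(first_lstm, second_lstm, dense_head, total)  # 10400 20200 51 30651
```

Note that the sequence length n_steps does not appear in the count: the same weights are reused at every time step.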
Model Training
Now, train the constructed LSTM model.
When writing LSTM code, first confirm batch size, sequence length, feature dimensionality, and hidden-state dimension. Once tensor shapes align correctly, proceed to inspect loss function, optimizer, and training logs.
# Train the model
model.fit(X, y, epochs=200, verbose=1)
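The call above trains on every window, so the training loss alone says little about how the model handles unseen data. A possible refinement, sketched here as an assumption and not as part of the original workflow, is to hold out the most recent windows in time order. `chronological_split` is a hypothetical helper, and the demo arrays only stand in for the real `X` and `y`.

```python
import numpy as np

def chronological_split(X, y, test_fraction=0.2):
    # Keep time order: the earliest windows train the model, the latest ones test it.
    n_test = int(len(X) * test_fraction)
    split = len(X) - n_test
    return X[:split], X[split:], y[:split], y[split:]

# Stand-in arrays with the same shapes as the article's X and y (990 windows of 10 steps).
X_demo = np.arange(990 * 10, dtype=float).reshape(990, 10, 1)
y_demo = np.arange(990, dtype=float)

X_train, X_test, y_train, y_test = chronological_split(X_demo, y_demo, test_fraction=0.2)
print(X_train.shape, X_test.shape)  # (792, 10, 1) (198, 10, 1)
```

With such a split, the held-out part could be passed to Keras through the `validation_data` argument of `model.fit`, so that a validation loss appears next to the training loss in the logs.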
Making Predictions
After training, we use the model to make predictions. Here, we feed the last n_steps data points into the model and obtain its forecast.
# Make prediction
last_steps = data.values[-n_steps:].reshape((1, n_steps, 1))
predicted_value = model.predict(last_steps)
predicted_value = scaler.inverse_transform(predicted_value) # Reverse normalization
print("Predicted next value:", predicted_value[0][0])
Visualizing Results
Finally, visualize the prediction alongside actual values to evaluate model performance.
As you study this implementation, begin with a small, reproducible scenario you fully understand, then map the related concepts and practice steps onto it. After reading, retell the entire process using your own example.
# Visualization
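actual = scaler.inverse_transform(data.values[-100:])  # Back to the original scale, matching predicted_value
plt.plot(data.index[-100:], actual, label='Actual Values')
plt.axvline(x=len(data) - n_steps, color='r', linestyle='--', label='Input Window Start')
plt.scatter(len(data), predicted_value[0][0], color='g', label='Predicted Value')  # The forecast is for the step after the last observation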
plt.legend()
plt.title('LSTM Prediction Result')
plt.xlabel('Time')
plt.ylabel('Value')
plt.show()
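A single plotted point gives only a rough impression of performance. One common complement, offered here as a sketch and not as the article's method, is to compare the model's error with a naive persistence baseline that predicts the last observed value of each window. The helper names and the toy arrays below are hypothetical.

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error, the same quantity used as the training loss.
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def persistence_forecast(X):
    # Naive baseline: the next value equals the last value in the window.
    # X has shape (samples, n_steps, 1).
    return X[:, -1, 0]

# Toy windows and targets standing in for the real X and y.
X_demo = np.array([[[0.0], [1.0], [2.0]],
                   [[1.0], [2.0], [3.0]]])
y_demo = np.array([3.0, 4.0])

baseline_pred = persistence_forecast(X_demo)
print(baseline_pred.tolist())      # [2.0, 3.0]
print(mse(y_demo, baseline_pred))  # 1.0
```

If the LSTM's error on held-out windows is not clearly below this baseline, the model has probably learned little beyond repeating the most recent value.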
Conclusion
In this article, we implemented an LSTM model with Keras and applied it to a straightforward time-series forecasting task. The walkthrough shows how to build, train, and apply an LSTM network in practice. In upcoming articles, we'll explore the architectural characteristics of BERT. Stay tuned!