Python for Data Science: From Arrays to Analysis
Lesson 02 / 10 · Intermediate · 60 min

Advanced NumPy: Linear Algebra and Array Manipulation

Dive deeper into NumPy with linear algebra operations, random number generation, advanced indexing, and array reshaping techniques.

Learning Objectives: After this lesson, you should be able to use NumPy's linear algebra routines, random number generation, array reshaping, and advanced indexing techniques, all of which are essential for data science and machine learning.

Reshaping Arrays

Reshaping is fundamental to data manipulation—it lets you transform data between different dimensional representations without changing the underlying values.

When you reshape a 12-element array into a (3, 4) matrix, the elements fill the matrix row by row: the first four go to row 0, the next four to row 1, and the last four to row 2.

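A minimal sketch of row-major reshaping. The array `a` and the other variable names are illustrative assumptions, not taken from the lesson:

```python
import numpy as np

# Hypothetical example data: the integers 0..11
a = np.arange(12)

# Reshape to 3 rows x 4 columns; elements fill row by row (C order)
m = a.reshape(3, 4)
print(m)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# -1 asks NumPy to infer that dimension from the total size
m2 = a.reshape(2, -1)
print(m2.shape)  # (2, 6)

# Here reshape returns a view, so m shares memory with a
print(np.shares_memory(a, m))  # True
```

The total number of elements must stay the same, so `a.reshape(5, 3)` would raise a `ValueError` for this array.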

Flatten and Ravel

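flatten() and ravel() both convert a multi-dimensional array back to 1D. flatten() always returns a copy, while ravel() returns a view of the original data whenever the memory layout allows it.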

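The copy-versus-view difference is easiest to see by writing to the results. This is a small sketch with a hypothetical 2 x 3 array `m`:

```python
import numpy as np

m = np.arange(6).reshape(2, 3)

flat = m.flatten()  # always a copy
rav = m.ravel()     # a view here, because m is contiguous

flat[0] = 100  # does not touch m
rav[1] = 200   # writes through to m

print(m)
# [[  0 200   2]
#  [  3   4   5]]
print(np.shares_memory(m, flat))  # False
print(np.shares_memory(m, rav))   # True
```

Use `flatten()` when you need an independent copy and `ravel()` when you want to avoid the extra allocation and will not modify the result by accident.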

Transpose and Axis Swapping

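A short sketch of `.T`, `np.swapaxes()`, and `np.transpose()` with an explicit axis order. The 3-D array `x` is a hypothetical stand-in for data shaped like (batch, height, width):

```python
import numpy as np

m = np.arange(6).reshape(2, 3)
print(m.T.shape)  # (3, 2)

# Hypothetical 3-D data with shape (2, 3, 4)
x = np.arange(24).reshape(2, 3, 4)

# swapaxes exchanges exactly two axes
s = np.swapaxes(x, 1, 2)
print(s.shape)  # (2, 4, 3)

# transpose accepts a full permutation of the axes
t = np.transpose(x, (2, 0, 1))
print(t.shape)  # (4, 2, 3)

# Element x[i, j, k] ends up at t[k, i, j]
print(x[1, 2, 3] == t[3, 1, 2])  # True
```

None of these operations move data in memory. They return views with different strides, so they are cheap even for large arrays.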

Stacking and Splitting Arrays

Stacking (Combining Arrays)

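The common stacking functions differ mainly in which axis they join along. A minimal sketch, assuming two hypothetical 1-D arrays `a` and `b`:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

v = np.vstack([a, b])  # stack as rows
print(v.shape)  # (2, 3)

h = np.hstack([a, b])  # join end to end
print(h)  # [1 2 3 4 5 6]

# concatenate joins along an existing axis
c = np.concatenate([v, v], axis=0)
print(c.shape)  # (4, 3)

# stack creates a new axis; axis=1 makes a and b the columns
s = np.stack([a, b], axis=1)
print(s)
# [[1 4]
#  [2 5]
#  [3 6]]
```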

Splitting Arrays

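Splitting is the inverse of stacking. The sketch below uses a hypothetical 12-element array `x` and shows equal splits, splits at chosen indices, and uneven splits:

```python
import numpy as np

x = np.arange(12)

# Three equal pieces of four elements each
parts = np.split(x, 3)
print([p.tolist() for p in parts])
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

# Split at explicit indices: x[:2], x[2:7], x[7:]
first, middle, last = np.split(x, [2, 7])
print(len(first), len(middle), len(last))  # 2 5 5

# array_split tolerates sizes that do not divide evenly
uneven = np.array_split(x, 5)
print([len(p) for p in uneven])  # [3, 3, 2, 2, 2]

# hsplit divides a 2-D array column-wise
left, right = np.hsplit(x.reshape(3, 4), 2)
print(left.shape, right.shape)  # (3, 2) (3, 2)
```

`np.split()` raises a `ValueError` when an integer split count does not divide the axis evenly, which is when `np.array_split()` is the safer choice.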

Linear Algebra Operations

Linear algebra is the mathematical foundation of machine learning. NumPy supports it through the @ operator for matrix products and the np.linalg module for determinants, inverses, eigendecompositions, and linear solvers.

Matrix Multiplication

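A common source of bugs is confusing the matrix product (`@`) with the element-wise product (`*`). A small sketch with two hypothetical 2 x 2 matrices:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Matrix product: rows of A times columns of B
C = A @ B
print(C)
# [[19 22]
#  [43 50]]

# Element-wise product: a different operation entirely
E = A * B
print(E)
# [[ 5 12]
#  [21 32]]

# Inner dimensions must match: (2, 3) @ (3, 4) gives (2, 4)
P = np.ones((2, 3)) @ np.ones((3, 4))
print(P.shape)  # (2, 4)
```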

Dot Product of Vectors

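For 1-D arrays, `np.dot()` and `@` both compute the sum of element-wise products. The sketch below uses hypothetical vectors `u` and `v` and also derives the cosine of the angle between them:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, -5.0, 6.0])

# 1*4 + 2*(-5) + 3*6 = 12
d = np.dot(u, v)
print(d)  # 12.0

# The @ operator gives the same result for 1-D arrays
print(u @ v)  # 12.0

# Cosine similarity: dot product divided by the product of the norms
cos = d / (np.linalg.norm(u) * np.linalg.norm(v))

# Orthogonal vectors have a dot product of zero
print(np.dot([1, 0], [0, 1]))  # 0
```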

Matrix Properties

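A minimal sketch of the usual matrix properties, assuming a hypothetical invertible 2 x 2 matrix `A`. Results are floating point, so compare them with a tolerance instead of `==`:

```python
import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])

det = np.linalg.det(A)           # 4*6 - 7*2, approximately 10
inv = np.linalg.inv(A)           # approximately [[0.6, -0.7], [-0.2, 0.4]]
rank = np.linalg.matrix_rank(A)  # 2, so A has full rank
trace = np.trace(A)              # 4 + 6 = 10.0

# A matrix times its inverse is (numerically) the identity
print(np.allclose(A @ inv, np.eye(2)))  # True
```

A determinant close to zero signals a singular or nearly singular matrix, in which case `np.linalg.inv()` either raises `LinAlgError` or returns unreliable values.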

Eigenvalues and Eigenvectors

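Eigenvalues and eigenvectors describe the directions that a linear transformation only stretches or shrinks. They are central to understanding data transformations and to dimensionality reduction techniques such as PCA.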

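The sketch below assumes a hypothetical symmetric matrix `S`, as a covariance matrix in PCA would be, and therefore uses `np.linalg.eigh()`. For general square matrices, `np.linalg.eig()` is the corresponding routine.

```python
import numpy as np

# Hypothetical symmetric matrix (covariance matrices are symmetric)
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh returns eigenvalues in ascending order;
# the eigenvectors are the columns of vecs
vals, vecs = np.linalg.eigh(S)
print(vals)  # approximately [1, 3]

# Check the defining property S v = lambda v for every pair at once
print(np.allclose(S @ vecs, vecs * vals))  # True

# For PCA, the eigenvector with the largest eigenvalue is the
# direction of greatest variance
principal = vecs[:, np.argmax(vals)]
```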

Solving Linear Systems

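Solving Ax = b for x is fundamental to many algorithms, including linear regression. Prefer np.linalg.solve(A, b) over computing the inverse of A explicitly, since it is faster and numerically more stable.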

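As an illustration, consider the hypothetical system 2x + y = 5 and x + 3y = 10, whose solution is x = 1, y = 3:

```python
import numpy as np

# Coefficient matrix and right-hand side for
#   2x +  y = 5
#    x + 3y = 10
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

x = np.linalg.solve(A, b)
print(x)  # approximately [1, 3]

# Verify the solution by substituting it back
print(np.allclose(A @ x, b))  # True
```

`np.linalg.solve()` requires a square, non-singular `A`. For overdetermined systems, such as a regression with more samples than features, `np.linalg.lstsq()` computes the least-squares solution instead.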

Advanced Indexing

Using np.where()

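np.where(condition, x, y) works like a vectorized if-else statement: it takes elements from x where the condition is True and from y elsewhere. Called with only a condition, it returns the indices where the condition holds.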

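A small sketch of both calling forms, using a hypothetical array of exam scores and an assumed pass mark of 60:

```python
import numpy as np

scores = np.array([45, 82, 67, 90, 58])

# Three-argument form: choose a value per element
labels = np.where(scores >= 60, "pass", "fail")
print(labels)  # ['fail' 'pass' 'pass' 'pass' 'fail']

# One-argument form: indices where the condition is True
idx = np.where(scores >= 60)[0]
print(idx)  # [1 2 3]

# Numeric replacement: set failing scores to 0, keep the rest
cleaned = np.where(scores < 60, 0, scores)
print(cleaned)  # [ 0 82 67 90  0]
```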

np.select() for Multiple Conditions

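`np.select()` generalizes `np.where()` to several conditions, much like an if/elif/else chain. The scores and tier thresholds below are hypothetical:

```python
import numpy as np

scores = np.array([45, 82, 67, 90, 58])

# Conditions are checked in order; the first match wins
conditions = [scores >= 80, scores >= 60]
choices = [2, 1]  # tier 2 for 80+, tier 1 for 60..79

# default covers elements that match no condition
tiers = np.select(conditions, choices, default=0)
print(tiers)  # [0 2 1 2 0]
```

Because the first matching condition wins, order the conditions from most to least specific. A score of 82 satisfies both conditions here but is assigned tier 2.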

np.clip() for Bounding Values

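`np.clip()` limits values to a range, which is a simple way to cap outliers. A minimal sketch with hypothetical data and bounds:

```python
import numpy as np

x = np.array([-5, 0, 3, 12, 20])

# Values below 0 become 0, values above 10 become 10
bounded = np.clip(x, 0, 10)
print(bounded)  # [ 0  0  3 10 10]

# Pass None to leave one side unbounded
non_negative = np.clip(x, 0, None)
print(non_negative)  # [ 0  0  3 12 20]
```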

Random Number Generation

NumPy's random module is essential for simulations, sampling, and machine learning. It can draw samples from many distributions, including uniform, normal, binomial, and Poisson.

Random Number Generators

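The sketch below uses `np.random.default_rng()`, the Generator interface that the NumPy documentation recommends for new code over the legacy `np.random.seed()` functions. The seed value 42 is an arbitrary choice for illustration:

```python
import numpy as np

# A seeded generator makes results reproducible
rng = np.random.default_rng(42)

u = rng.random(3)                    # floats in [0, 1)
ints = rng.integers(0, 10, size=5)   # integers in [0, 10), high excluded
n = rng.normal(0.0, 1.0, size=(2, 3))  # standard normal, shape (2, 3)

# A second generator with the same seed repeats the same sequence
rng_again = np.random.default_rng(42)
print(np.array_equal(u, rng_again.random(3)))  # True
```

Passing the generator object into functions explicitly, instead of relying on global state, keeps experiments reproducible and easier to test.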

Sampling and Shuffling

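A minimal sketch of sampling and shuffling with the Generator API, including the shuffle-then-slice pattern for a train/test split. The data, seed, and 80/20 ratio are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.arange(10)

# Sampling without replacement: no element appears twice
sample = rng.choice(data, size=4, replace=False)

# Sampling with replacement: the basis of bootstrapping
boot = rng.choice(data, size=len(data), replace=True)

# permutation returns a shuffled copy and leaves data unchanged
perm = rng.permutation(data)

# shuffle reorders an array in place
deck = data.copy()
rng.shuffle(deck)

# Train/test split: shuffle the indices, then slice
idx = rng.permutation(len(data))
train, test = data[idx[:8]], data[idx[8:]]
print(train.shape, test.shape)  # (8,) (2,)
```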

Statistical Distributions

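The sketch below draws from several common distributions and compares each sample mean with its theoretical value. All parameters (for example, a mean of 170 and a standard deviation of 10) are hypothetical choices:

```python
import numpy as np

rng = np.random.default_rng(1)
size = 100_000

normal = rng.normal(loc=170.0, scale=10.0, size=size)  # mean 170
uniform = rng.uniform(0.0, 1.0, size=size)             # mean 0.5
binomial = rng.binomial(n=10, p=0.5, size=size)        # mean n*p = 5
poisson = rng.poisson(lam=3.0, size=size)              # mean lam = 3
expo = rng.exponential(scale=2.0, size=size)           # mean scale = 2

# With this many samples, the sample means land close to theory
for name, draws in [("normal", normal), ("uniform", uniform),
                    ("binomial", binomial), ("poisson", poisson),
                    ("exponential", expo)]:
    print(f"{name:12s} mean = {draws.mean():.3f}")
```

Note that `exponential()` is parameterized by the scale (the mean), not the rate, so a rate of 0.5 corresponds to `scale=2.0`.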

Practical Example: Data Preprocessing Pipeline

Let's combine everything into a realistic data preprocessing example that imputes missing values, bounds outliers, standardizes features, and splits the data.

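One possible pipeline is sketched below. The synthetic data, the percentile bounds, and the 80/20 split are all assumptions made for illustration, not the lesson's original example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw data: 100 samples, 3 features on different scales
X = rng.normal(loc=[50.0, 0.5, 1000.0],
               scale=[10.0, 0.1, 300.0],
               size=(100, 3))

# Simulate missing values in the second feature
missing_rows = rng.choice(100, size=5, replace=False)
X[missing_rows, 1] = np.nan

# 1. Impute missing values with the column mean (ignoring NaNs)
col_mean = np.nanmean(X, axis=0)
X = np.where(np.isnan(X), col_mean, X)

# 2. Clip each feature to its own 1st..99th percentile range
lo, hi = np.percentile(X, [1, 99], axis=0)
X = np.clip(X, lo, hi)

# 3. Standardize each feature to zero mean and unit variance
X = (X - X.mean(axis=0)) / X.std(axis=0)

# 4. Shuffle the row indices and split 80/20
idx = rng.permutation(len(X))
train, test = X[idx[:80]], X[idx[80:]]
print(train.shape, test.shape)  # (80, 3) (20, 3)
```

For brevity, this sketch computes the imputation, clipping, and scaling statistics on the full dataset. In a real project, compute them on the training split only and apply them to the test split, so that no information leaks from test to train.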

Key Takeaways

  • Reshaping transforms array dimensions with reshape(), flatten(), ravel(), and transpose()
  • Stacking and splitting combine or divide arrays with vstack(), hstack(), concatenate(), and split()
  • Linear algebra operations include matrix multiplication (@), inverse, determinant, eigenvalues, and solving systems
  • Advanced indexing with np.where(), np.select(), and np.clip() enables powerful conditional operations
  • Random generation provides tools for sampling, shuffling, and various statistical distributions
  • Data preprocessing pipelines combine these tools for real-world data preparation

Connections: Advanced NumPy in Practice

🔗 Connection to Machine Learning

ML Task | NumPy Operation
Feature scaling | Broadcasting + axis operations
PCA | Eigenvalue decomposition
Linear regression | Solving Ax = b
Train/test split | Shuffling + slicing
Data augmentation | Random transformations

🔗 Connection to Deep Learning

NumPy operations mirror neural network computations:

  • Forward pass: Matrix multiplications (@ operator)
  • Batch normalization: Mean/std along batch axis
  • Dropout: Random masking
  • Weight initialization: Random distributions

🔗 Connection to Statistics

Statistical Concept | NumPy Function
Covariance matrix | np.cov()
Correlation | np.corrcoef()
Monte Carlo simulation | Random sampling
Bootstrapping | np.random.choice() with replacement

Practice Exercises

Exercise 1: Matrix Operations

[Interactive exercise: Python code executor]

Exercise 2: Data Simulation

[Interactive exercise: Python code executor]

Next Steps

Now that you've worked through NumPy's core tools, you're ready for pandas, the library that builds on NumPy to provide intuitive data structures for real-world data analysis.

Further Reading

Visualize It

Official Docs

Performance & Modern Patterns

  • NumPy Performance Tips — how to actually get speedups.
  • Numba — JIT-compile NumPy-heavy functions to near-C speed with one decorator.
  • CuPy — drop-in NumPy on NVIDIA GPUs.

Tutorials & Books