Advanced NumPy: Linear Algebra and Array Manipulation

Learning Objectives: After this lesson, you'll master NumPy's linear algebra capabilities, random number generation, array reshaping, and advanced indexing techniques essential for data science and machine learning.

Reshaping Arrays

Reshaping is fundamental to data manipulation—it lets you transform data between different dimensional representations without changing the underlying values.


When you reshape a 12-element array into a (3, 4) matrix, the elements are distributed row by row, so the first four values form the first row.
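A minimal sketch of that reshape, assuming the 12 elements are simply the integers 0 through 11 (the variable names here are illustrative):

```python
import numpy as np

# A 12-element 1D array: 0, 1, ..., 11
arr = np.arange(12)

# Reshape to 3 rows x 4 columns; elements fill row by row (C order)
matrix = arr.reshape(3, 4)
print(matrix)

# -1 asks NumPy to infer that dimension from the total size
inferred = arr.reshape(-1, 6)
print(inferred.shape)  # (2, 6)
```

The total number of elements must stay the same, so reshaping 12 elements to (5, 3) would raise a ValueError.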


Flatten and Ravel

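Both flatten() and ravel() convert a multi-dimensional array back to 1D. flatten() always returns a copy, while ravel() returns a view of the original data whenever the memory layout allows it.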
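The copy-versus-view difference is easiest to see by modifying the results. This is a small sketch on an illustrative 2x3 array:

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)

flat_copy = matrix.flatten()  # always a new, independent array
flat_view = matrix.ravel()    # a view here, because matrix is contiguous

flat_copy[0] = 99  # does not affect matrix
flat_view[1] = 77  # writes through to matrix

print(matrix)
print(np.shares_memory(matrix, flat_view))  # True
print(np.shares_memory(matrix, flat_copy))  # False
```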


Transpose and Axis Swapping
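As a rough illustration, the sketch below transposes a 2D array and reorders the axes of a 3D array. The shapes are arbitrary choices for the example:

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)
transposed = matrix.T  # rows become columns: shape (3, 2)

cube = np.arange(24).reshape(2, 3, 4)
swapped = np.swapaxes(cube, 0, 2)    # exchange axes 0 and 2: shape (4, 3, 2)
reordered = cube.transpose(1, 0, 2)  # explicit axis order: shape (3, 2, 4)

print(transposed.shape, swapped.shape, reordered.shape)
```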


Stacking and Splitting Arrays

Stacking (Combining Arrays)
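A short sketch of the three stacking functions, using small illustrative arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

v = np.vstack([a, b])  # stack as rows: shape (2, 3)
h = np.hstack([a, b])  # join end to end: shape (6,)

m = np.array([[1, 2], [3, 4]])
n = np.array([[5, 6]])
c = np.concatenate([m, n], axis=0)  # append n as a new row: shape (3, 2)

print(v.shape, h.shape, c.shape)
```

With concatenate(), all dimensions except the one named by axis must match.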


Splitting Arrays
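Splitting is the inverse of stacking. The sketch below shows equal splits, splits at explicit indices, and uneven splits on an illustrative 12-element array:

```python
import numpy as np

arr = np.arange(12)

parts = np.split(arr, 3)          # three equal pieces of 4 elements
at_index = np.split(arr, [3, 7])  # cut before index 3 and index 7
uneven = np.array_split(arr, 5)   # allows pieces of unequal size

grid = arr.reshape(3, 4)
left, right = np.hsplit(grid, 2)  # split columns in half: each (3, 2)

print([p.size for p in uneven])  # [3, 3, 2, 2, 2]
```

np.split() raises an error if the array cannot be divided evenly, which is why np.array_split() is used for the five-way split.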


Linear Algebra Operations

Linear algebra is the mathematical foundation of machine learning. NumPy supports it through the @ operator for matrix multiplication and the np.linalg module for determinants, inverses, eigendecomposition, and solving linear systems.

Matrix Multiplication
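A minimal sketch contrasting matrix multiplication with element-wise multiplication, on two illustrative 2x2 matrices:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

product = A @ B      # matrix product: [[19, 22], [43, 50]]
elementwise = A * B  # element-wise product: [[5, 12], [21, 32]]

print(product)
print(elementwise)
```

For 2D arrays, A @ B is equivalent to np.matmul(A, B), and the number of columns in A must equal the number of rows in B.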


Dot Product of Vectors
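As a small example, the sketch below computes a dot product and uses it for cosine similarity. The vectors are made up for illustration:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

dot = np.dot(u, v)  # 1*4 + 2*5 + 3*6 = 32.0

# Cosine similarity: dot product divided by the product of the lengths
cosine = dot / (np.linalg.norm(u) * np.linalg.norm(v))

print(dot, cosine)
```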


Matrix Properties
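The sketch below computes a few common properties of an illustrative 2x2 matrix using np.linalg:

```python
import numpy as np

A = np.array([[4.0, 7.0], [2.0, 6.0]])

det = np.linalg.det(A)           # 4*6 - 7*2 = 10 (up to rounding)
inv = np.linalg.inv(A)           # inverse exists because det != 0
rank = np.linalg.matrix_rank(A)  # number of independent rows/columns
trace = np.trace(A)              # sum of the diagonal: 10.0

# Multiplying a matrix by its inverse gives the identity matrix
print(np.allclose(A @ inv, np.eye(2)))
```

Because these are floating-point computations, compare results with np.allclose() rather than ==.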


Eigenvalues and Eigenvectors

An eigenvector of a matrix is a direction that the matrix only scales, and the eigenvalue is the scaling factor. This is why eigendecomposition underpins dimensionality reduction techniques such as PCA.
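A minimal sketch with np.linalg.eig on an illustrative symmetric matrix, checking the defining relation A v = λ v:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])

# Eigenvectors are returned as the columns of `vectors`
values, vectors = np.linalg.eig(A)
print(values)  # eigenvalues are 1 and 3; the order is not guaranteed

# Each column v satisfies A @ v == value * v
print(np.allclose(A @ vectors, vectors * values))
```

For symmetric matrices such as covariance matrices, np.linalg.eigh() is usually the better choice because it returns real eigenvalues in ascending order.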


Solving Linear Systems

Solving Ax = b for x is fundamental to many algorithms, including linear regression.
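As a sketch, the system 3x + y = 9 and x + 2y = 8 (chosen for illustration) can be solved like this:

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])  # coefficient matrix
b = np.array([9.0, 8.0])                # right-hand side

x = np.linalg.solve(A, b)
print(x)  # approximately [2. 3.]

# Verify the solution by substituting it back
print(np.allclose(A @ x, b))
```

np.linalg.solve() is generally preferable to computing np.linalg.inv(A) @ b, since it avoids forming the inverse explicitly.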


Advanced Indexing

Using np.where()

np.where(condition, x, y) works like a vectorized if-else: it takes values from x where the condition is True and from y everywhere else.
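A small sketch on made-up exam scores, showing both the three-argument form and the one-argument form that returns indices:

```python
import numpy as np

scores = np.array([45, 82, 67, 91, 58])

# Three arguments: choose a value per element
labels = np.where(scores >= 60, "pass", "fail")
print(labels)

# One argument: return the indices where the condition holds
passing_idx = np.where(scores >= 60)[0]
print(passing_idx)  # [1 2 3]
```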


np.select() for Multiple Conditions
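When there are more than two outcomes, np.select() takes a list of conditions and a matching list of choices. The grading thresholds below are assumptions made up for this sketch:

```python
import numpy as np

scores = np.array([45, 82, 67, 91, 58])

# Conditions are checked in order; the first True one wins
conditions = [scores >= 90, scores >= 80, scores >= 60]
choices = ["A", "B", "C"]

grades = np.select(conditions, choices, default="F")
print(grades)
```

The default is given as a string here so that it matches the type of the choices.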


np.clip() for Bounding Values
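np.clip() limits every element to a given range. A minimal sketch with illustrative bounds of 0 and 10:

```python
import numpy as np

values = np.array([-5, 3, 12, 7, 20])

# Values below 0 become 0; values above 10 become 10
clipped = np.clip(values, 0, 10)
print(clipped)  # [ 0  3 10  7 10]
```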


Random Number Generation

NumPy's random module supports simulations, sampling, and machine learning tasks such as weight initialization and train/test splitting.

Random Number Generators
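The sketch below uses np.random.default_rng(), the generator interface recommended in current NumPy documentation. The seed value 42 is an arbitrary choice for reproducibility:

```python
import numpy as np

# A seeded generator produces the same sequence on every run
rng = np.random.default_rng(seed=42)

floats = rng.random(3)               # uniform floats in [0, 1)
ints = rng.integers(0, 10, size=5)   # integers in [0, 10), upper bound excluded

# A second generator with the same seed reproduces the same values
rng_again = np.random.default_rng(seed=42)
print(np.array_equal(floats, rng_again.random(3)))  # True
```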


Sampling and Shuffling
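A short sketch of sampling with and without replacement and of the two ways to shuffle, using an illustrative array of ten integers:

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.arange(10)

sample = rng.choice(data, size=4, replace=False)     # no element repeated
bootstrap = rng.choice(data, size=10, replace=True)  # repeats allowed

shuffled = rng.permutation(data)  # returns a shuffled copy
rng.shuffle(data)                 # shuffles data in place

print(sample, bootstrap, shuffled, data)
```

Sampling with replacement is the basis of bootstrapping, and a permutation of indices is a common way to build a train/test split.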


Statistical Distributions
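The generator object exposes methods for many distributions. This sketch draws from four of them, with parameters chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

normal = rng.normal(loc=0.0, scale=1.0, size=n)    # mean 0, std 1
uniform = rng.uniform(low=-1.0, high=1.0, size=n)  # values in [-1, 1)
binomial = rng.binomial(n=10, p=0.5, size=n)       # successes in 10 trials
poisson = rng.poisson(lam=3.0, size=n)             # event counts, rate 3

# Sample statistics should land close to the theoretical values
print(normal.mean(), normal.std())
print(binomial.mean(), poisson.mean())
```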


Practical Example: Data Preprocessing Pipeline

Let's combine everything into a realistic data preprocessing example:
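One possible pipeline is sketched below. The dataset is synthetic, and the specific steps (mean imputation, clipping at 3 standard deviations, standardization, an 80/20 split) are assumptions chosen to exercise the tools from this lesson, not a prescribed recipe:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic data (assumption): 100 samples, 3 features on different scales
X = rng.normal(loc=[50.0, 0.5, 1000.0], scale=[10.0, 0.1, 250.0], size=(100, 3))
X[rng.integers(0, 100, size=5), 0] = np.nan  # inject a few missing values

# 1. Impute missing values with the column mean (ignoring NaNs)
col_means = np.nanmean(X, axis=0)
X = np.where(np.isnan(X), col_means, X)

# 2. Clip outliers to within 3 standard deviations of the column mean
mean, std = X.mean(axis=0), X.std(axis=0)
X = np.clip(X, mean - 3 * std, mean + 3 * std)

# 3. Standardize each feature to mean 0 and standard deviation 1
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# 4. Shuffle the row indices and split 80/20
idx = rng.permutation(len(X_scaled))
split = int(0.8 * len(X_scaled))
train, test = X_scaled[idx[:split]], X_scaled[idx[split:]]

print(train.shape, test.shape)  # (80, 3) (20, 3)
```

In a real project, the imputation and scaling statistics would normally be computed on the training split only and then applied to the test split.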


Key Takeaways

✅ Reshaping transforms array dimensions with reshape(), flatten(), ravel(), and transpose()

✅ Stacking and splitting combine or divide arrays with vstack(), hstack(), concatenate(), split()

✅ Linear algebra operations include matrix multiplication (@), inverse, determinant, eigenvalues, and solving systems

✅ Advanced indexing with np.where(), np.select(), np.clip() enables powerful conditional operations

✅ Random generation provides tools for sampling, shuffling, and various statistical distributions

✅ Data preprocessing pipelines combine these tools for real-world data preparation

Connections: Advanced NumPy in Practice

🔗 Connection to Machine Learning

ML Task	NumPy Operation
Feature scaling	Broadcasting + axis operations
PCA	Eigenvalue decomposition
Linear regression	Solving Ax = b
Train/test split	Shuffling + slicing
Data augmentation	Random transformations

🔗 Connection to Deep Learning

NumPy operations mirror neural network computations:

Forward pass: Matrix multiplications (@ operator)
Batch normalization: Mean/std along batch axis
Dropout: Random masking
Weight initialization: Random distributions

🔗 Connection to Statistics

Statistical Concept	NumPy Function
Covariance matrix	`np.cov()`
Correlation	`np.corrcoef()`
Monte Carlo simulation	Random sampling
Bootstrapping	`np.random.choice()` with replacement

Practice Exercises

Exercise 1: Matrix Operations


Exercise 2: Data Simulation


Next Steps

Now that you've mastered NumPy, you're ready for pandas—the library that builds on NumPy to provide intuitive data structures for real-world data analysis.

Ready to work with real datasets? Pandas is next!
