From e61f8f9fc4b8cd2cc33610968648648f1aea3b19 Mon Sep 17 00:00:00 2001 From: Vacodwave <854108540@qq.com> Date: Thu, 13 Apr 2023 10:52:08 +0800 Subject: [PATCH] Fixed table formatting --- .../default-37a8.jupyterlab-workspace | 1 + ...Model_Representation_Soln-checkpoint.ipynb | 428 ++++++++++++++++++ ...1_W1_Lab03_Model_Representation_Soln.ipynb | 38 +- 3 files changed, 463 insertions(+), 4 deletions(-) create mode 100644 .jupyter/desktop-workspaces/default-37a8.jupyterlab-workspace create mode 100644 Supervised Machine Learning Regression and Classification/week1/4.Regression Model/.ipynb_checkpoints/C1_W1_Lab03_Model_Representation_Soln-checkpoint.ipynb diff --git a/.jupyter/desktop-workspaces/default-37a8.jupyterlab-workspace b/.jupyter/desktop-workspaces/default-37a8.jupyterlab-workspace new file mode 100644 index 00000000..9e22bd55 --- /dev/null +++ b/.jupyter/desktop-workspaces/default-37a8.jupyterlab-workspace @@ -0,0 +1 @@ +{"data":{"layout-restorer:data":{"main":{"dock":{"type":"tab-area","currentIndex":1,"widgets":["notebook:Supervised Machine Learning Regression and Classification/week1/4.Regression Model/C1_W1_Lab03_Model_Representation_Soln.ipynb"]},"current":"notebook:Supervised Machine Learning Regression and Classification/week1/4.Regression Model/C1_W1_Lab03_Model_Representation_Soln.ipynb"},"down":{"size":0,"widgets":[]},"left":{"collapsed":false,"current":"filebrowser","widgets":["filebrowser","running-sessions","@jupyterlab/toc:plugin","extensionmanager.main-view"]},"right":{"collapsed":true,"widgets":["jp-property-inspector","debugger-sidebar"]},"relativeSizes":[0.1874852418803901,0.8125147581196098,0]},"file-browser-filebrowser:cwd":{"path":"Supervised Machine Learning Regression and Classification/week1/4.Regression Model"},"notebook:Supervised Machine Learning Regression and Classification/week1/4.Regression Model/C1_W1_Lab03_Model_Representation_Soln.ipynb":{"data":{"path":"Supervised Machine Learning Regression and 
Classification/week1/4.Regression Model/C1_W1_Lab03_Model_Representation_Soln.ipynb","factory":"Notebook"}}},"metadata":{"id":"default"}} \ No newline at end of file diff --git a/Supervised Machine Learning Regression and Classification/week1/4.Regression Model/.ipynb_checkpoints/C1_W1_Lab03_Model_Representation_Soln-checkpoint.ipynb b/Supervised Machine Learning Regression and Classification/week1/4.Regression Model/.ipynb_checkpoints/C1_W1_Lab03_Model_Representation_Soln-checkpoint.ipynb new file mode 100644 index 00000000..36696471 --- /dev/null +++ b/Supervised Machine Learning Regression and Classification/week1/4.Regression Model/.ipynb_checkpoints/C1_W1_Lab03_Model_Representation_Soln-checkpoint.ipynb @@ -0,0 +1,428 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "2b6f2c5c", + "metadata": {}, + "source": [ + "# Optional Lab: Model Representation\n", + "\n", + "
\n", + " \n", + "
" + ] + }, + { + "cell_type": "markdown", + "id": "343b5176", + "metadata": {}, + "source": [ + "## Goals\n", + "In this lab you will:\n", + "- Learn to implement the model $f_{w,b}$ for linear regression with one variable" + ] + }, + { + "cell_type": "markdown", + "id": "d494e04b", + "metadata": {}, + "source": [ + "## Notation\n", + "Here is a summary of some of the notation you will encounter. \n", + "\n", + "|General
Notation | Description | Python (if applicable) |\n", + "| ------------| ------------------------------------------------------------| ------------- |\n", + "| $a$ | scalar, non bold ||\n", + "| $\\mathbf{a}$ | vector, bold ||\n", + "| **Regression** | | | |\n", + "| $\\mathbf{x}$ | Training Example feature values (in this lab - Size (1000 sqft)) | `x_train` | \n", + "| $\\mathbf{y}$ | Training Example targets (in this lab Price (1000s of dollars)). | `y_train` \n", + "| $x^{(i)}$, $y^{(i)}$ | $i_{th}$Training Example | `x_i`, `y_i`|\n", + "| m | Number of training examples | `m`|\n", + "| $w$ | parameter: weight, | `w` |\n", + "| $b$ | parameter: bias | `b` | \n", + "| $f_{w,b}(x^{(i)})$ | The result of the model evaluation at $x^{(i)}$ parameterized by $w,b$: $f_{w,b}(x^{(i)}) = wx^{(i)}+b$ | `f_wb` | \n" + ] + }, + { + "cell_type": "markdown", + "id": "ffb63f1f", + "metadata": {}, + "source": [ + "## Tools\n", + "In this lab you will make use of: \n", + "- NumPy, a popular library for scientific computing\n", + "- Matplotlib, a popular library for plotting data" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "95eae0da", + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "plt.style.use('./deeplearning.mplstyle')" + ] + }, + { + "cell_type": "markdown", + "id": "9be6114c", + "metadata": {}, + "source": [ + "# Problem Statement\n", + " \n", + "\n", + "As in the lecture, you will use the motivating example of housing price prediction. \n", + "This lab will use a simple data set with only two data points - a house with 1000 square feet(sqft) sold for \\\\$300,000 and a house with 2000 square feet sold for \\\\$500,000. These two points will constitute our *data or training set*. 
In this lab, the units of size are 1000 sqft and the units of price are 1000s of dollars.\n", + "\n", + "| Size (1000 sqft) | Price (1000s of dollars) |\n", + "| -------------------| ------------------------ |\n", + "| 1.0 | 300 |\n", + "| 2.0 | 500 |\n", + "\n", + "You would like to fit a linear regression model (shown above as the blue straight line) through these two points, so you can then predict price for other houses - say, a house with 1200 sqft.\n" + ] + }, + { + "cell_type": "markdown", + "id": "55852a14", + "metadata": {}, + "source": [ + "Please run the following code cell to create your `x_train` and `y_train` variables. The data is stored in one-dimensional NumPy arrays." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "13f851e5", + "metadata": {}, + "outputs": [], + "source": [ + "# x_train is the input variable (size in 1000 square feet)\n", + "# y_train is the target (price in 1000s of dollars)\n", + "x_train = np.array([1.0, 2.0])\n", + "y_train = np.array([300.0, 500.0])\n", + "print(f\"x_train = {x_train}\")\n", + "print(f\"y_train = {y_train}\")" + ] + }, + { + "cell_type": "markdown", + "id": "c29de428", + "metadata": {}, + "source": [ + ">**Note**: The course will frequently utilize the python 'f-string' output formatting described [here](https://docs.python.org/3/tutorial/inputoutput.html) when printing. The content between the curly braces is evaluated when producing the output." + ] + }, + { + "cell_type": "markdown", + "id": "fd16d9a3", + "metadata": {}, + "source": [ + "### Number of training examples `m`\n", + "You will use `m` to denote the number of training examples. Numpy arrays have a `.shape` parameter. `x_train.shape` returns a python tuple with an entry for each dimension. `x_train.shape[0]` is the length of the array and number of examples as shown below." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "34941f09", + "metadata": {}, + "outputs": [], + "source": [ + "# m is the number of training examples\n", + "print(f\"x_train.shape: {x_train.shape}\")\n", + "m = x_train.shape[0]\n", + "print(f\"Number of training examples is: {m}\")" + ] + }, + { + "cell_type": "markdown", + "id": "d632fae4", + "metadata": {}, + "source": [ + "One can also use the Python `len()` function as shown below." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c4607c4f", + "metadata": {}, + "outputs": [], + "source": [ + "# m is the number of training examples\n", + "m = len(x_train)\n", + "print(f\"Number of training examples is: {m}\")" + ] + }, + { + "cell_type": "markdown", + "id": "e37fcde7", + "metadata": {}, + "source": [ + "### Training example `x_i, y_i`\n", + "\n", + "You will use (x$^{(i)}$, y$^{(i)}$) to denote the $i^{th}$ training example. Since Python is zero indexed, (x$^{(0)}$, y$^{(0)}$) is (1.0, 300.0) and (x$^{(1)}$, y$^{(1)}$) is (2.0, 500.0). \n", + "\n", + "To access a value in a Numpy array, one indexes the array with the desired offset. For example the syntax to access location zero of `x_train` is `x_train[0]`.\n", + "Run the next code block below to get the $i^{th}$ training example." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e5399aac", + "metadata": {}, + "outputs": [], + "source": [ + "i = 0 # Change this to 1 to see (x^1, y^1)\n", + "\n", + "x_i = x_train[i]\n", + "y_i = y_train[i]\n", + "print(f\"(x^({i}), y^({i})) = ({x_i}, {y_i})\")" + ] + }, + { + "cell_type": "markdown", + "id": "15c12a8c", + "metadata": {}, + "source": [ + "### Plotting the data" + ] + }, + { + "cell_type": "markdown", + "id": "b1be049e", + "metadata": {}, + "source": [ + "You can plot these two points using the `scatter()` function in the `matplotlib` library, as shown in the cell below. 
\n", + "- The function arguments `marker` and `c` show the points as red crosses (the default is blue dots).\n", + "\n", + "You can use other functions in the `matplotlib` library to set the title and labels to display" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5fc69a55", + "metadata": {}, + "outputs": [], + "source": [ + "# Plot the data points\n", + "plt.scatter(x_train, y_train, marker='x', c='r')\n", + "# Set the title\n", + "plt.title(\"Housing Prices\")\n", + "# Set the y-axis label\n", + "plt.ylabel('Price (in 1000s of dollars)')\n", + "# Set the x-axis label\n", + "plt.xlabel('Size (1000 sqft)')\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "id": "31e27c47", + "metadata": {}, + "source": [ + "## Model function\n", + "\n", + " As described in lecture, the model function for linear regression (which is a function that maps from `x` to `y`) is represented as \n", + "\n", + "$$ f_{w,b}(x^{(i)}) = wx^{(i)} + b \\tag{1}$$\n", + "\n", + "The formula above is how you can represent straight lines - different values of $w$ and $b$ give you different straight lines on the plot.
\n", + "\n", + "Let's try to get a better intuition for this through the code blocks below. Let's start with $w = 100$ and $b = 100$. \n", + "\n", + "**Note: You can come back to this cell to adjust the model's w and b parameters**" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a24ebd1a", + "metadata": {}, + "outputs": [], + "source": [ + "w = 100\n", + "b = 100\n", + "print(f\"w: {w}\")\n", + "print(f\"b: {b}\")" + ] + }, + { + "cell_type": "markdown", + "id": "be869ab6", + "metadata": {}, + "source": [ + "Now, let's compute the value of $f_{w,b}(x^{(i)})$ for your two data points. You can explicitly write this out for each data point as - \n", + "\n", + "for $x^{(0)}$, `f_wb = w * x[0] + b`\n", + "\n", + "for $x^{(1)}$, `f_wb = w * x[1] + b`\n", + "\n", + "For a large number of data points, this can get unwieldy and repetitive. So instead, you can calculate the function output in a `for` loop as shown in the `compute_model_output` function below.\n", + "> **Note**: The argument description `(ndarray (m,))` describes a Numpy n-dimensional array of shape (m,). `(scalar)` describes an argument without dimensions, just a magnitude. \n", + "> **Note**: `np.zeros(n)` will return a one-dimensional numpy array with $n$ entries \n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ed7f9ec2", + "metadata": {}, + "outputs": [], + "source": [ + "def compute_model_output(x, w, b):\n", + " \"\"\"\n", + " Computes the prediction of a linear model\n", + " Args:\n", + " x (ndarray (m,)): Data, m examples \n", + " w,b (scalar) : model parameters \n", + " Returns\n", + " y (ndarray (m,)): target values\n", + " \"\"\"\n", + " m = x.shape[0]\n", + " f_wb = np.zeros(m)\n", + " for i in range(m):\n", + " f_wb[i] = w * x[i] + b\n", + " \n", + " return f_wb" + ] + }, + { + "cell_type": "markdown", + "id": "7526c8e7", + "metadata": {}, + "source": [ + "Now let's call the `compute_model_output` function and plot the output.."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "12f7764f", + "metadata": {}, + "outputs": [], + "source": [ + "tmp_f_wb = compute_model_output(x_train, w, b,)\n", + "\n", + "# Plot our model prediction\n", + "plt.plot(x_train, tmp_f_wb, c='b',label='Our Prediction')\n", + "\n", + "# Plot the data points\n", + "plt.scatter(x_train, y_train, marker='x', c='r',label='Actual Values')\n", + "\n", + "# Set the title\n", + "plt.title(\"Housing Prices\")\n", + "# Set the y-axis label\n", + "plt.ylabel('Price (in 1000s of dollars)')\n", + "# Set the x-axis label\n", + "plt.xlabel('Size (1000 sqft)')\n", + "plt.legend()\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "id": "e07f2f91", + "metadata": {}, + "source": [ + "As you can see, setting $w = 100$ and $b = 100$ does *not* result in a line that fits our data. \n", + "\n", + "### Challenge\n", + "Try experimenting with different values of $w$ and $b$. What should the values be for a line that fits our data?\n", + "\n", + "#### Tip:\n", + "You can use your mouse to click on the triangle to the left of the green \"Hints\" below to reveal some hints for choosing b and w." + ] + }, + { + "cell_type": "markdown", + "id": "64af9d27", + "metadata": {}, + "source": [ + "
\n", + "\n", + " Hints\n", + "\n", + "

\n", + "

\n", + "

" + ] + }, + { + "cell_type": "markdown", + "id": "8bdbe7dd", + "metadata": {}, + "source": [ + "### Prediction\n", + "Now that we have a model, we can use it to make our original prediction. Let's predict the price of a house with 1200 sqft. Since the units of $x$ are in 1000's of sqft, $x$ is 1.2.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1678db16", + "metadata": {}, + "outputs": [], + "source": [ + "w = 200 \n", + "b = 100 \n", + "x_i = 1.2\n", + "cost_1200sqft = w * x_i + b \n", + "\n", + "print(f\"${cost_1200sqft:.0f} thousand dollars\")" + ] + }, + { + "cell_type": "markdown", + "id": "c68e9b44", + "metadata": {}, + "source": [ + "# Congratulations!\n", + "In this lab you have learned:\n", + " - Linear regression builds a model which establishes a relationship between features and targets\n", + " - In the example above, the feature was house size and the target was house price\n", + " - for simple linear regression, the model has two parameters $w$ and $b$ whose values are 'fit' using *training data*.\n", + " - once a model's parameters have been determined, the model can be used to make predictions on novel data." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "764f68d9", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.16" + }, + "toc-autonumbering": false + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/Supervised Machine Learning Regression and Classification/week1/4.Regression Model/C1_W1_Lab03_Model_Representation_Soln.ipynb b/Supervised Machine Learning Regression and Classification/week1/4.Regression Model/C1_W1_Lab03_Model_Representation_Soln.ipynb index 749774a8..36696471 100644 --- a/Supervised Machine Learning Regression and Classification/week1/4.Regression Model/C1_W1_Lab03_Model_Representation_Soln.ipynb +++ b/Supervised Machine Learning Regression and Classification/week1/4.Regression Model/C1_W1_Lab03_Model_Representation_Soln.ipynb @@ -2,6 +2,7 @@ "cells": [ { "cell_type": "markdown", + "id": "2b6f2c5c", "metadata": {}, "source": [ "# Optional Lab: Model Representation\n", @@ -13,6 +14,7 @@ }, { "cell_type": "markdown", + "id": "343b5176", "metadata": {}, "source": [ "## Goals\n", @@ -22,13 +24,14 @@ }, { "cell_type": "markdown", + "id": "d494e04b", "metadata": {}, "source": [ "## Notation\n", "Here is a summary of some of the notation you will encounter. \n", "\n", - "|General
Notation | Description| Python (if applicable) |\n", - "|: ------------|: ------------------------------------------------------------||\n", + "|General
Notation | Description | Python (if applicable) |\n", + "| ------------| ------------------------------------------------------------| ------------- |\n", "| $a$ | scalar, non bold ||\n", "| $\\mathbf{a}$ | vector, bold ||\n", "| **Regression** | | | |\n", @@ -43,6 +46,7 @@ }, { "cell_type": "markdown", + "id": "ffb63f1f", "metadata": {}, "source": [ "## Tools\n", @@ -54,6 +58,7 @@ { "cell_type": "code", "execution_count": null, + "id": "95eae0da", "metadata": {}, "outputs": [], "source": [ @@ -64,6 +69,7 @@ }, { "cell_type": "markdown", + "id": "9be6114c", "metadata": {}, "source": [ "# Problem Statement\n", @@ -82,6 +88,7 @@ }, { "cell_type": "markdown", + "id": "55852a14", "metadata": {}, "source": [ "Please run the following code cell to create your `x_train` and `y_train` variables. The data is stored in one-dimensional NumPy arrays." @@ -90,6 +97,7 @@ { "cell_type": "code", "execution_count": null, + "id": "13f851e5", "metadata": {}, "outputs": [], "source": [ @@ -103,6 +111,7 @@ }, { "cell_type": "markdown", + "id": "c29de428", "metadata": {}, "source": [ ">**Note**: The course will frequently utilize the python 'f-string' output formatting described [here](https://docs.python.org/3/tutorial/inputoutput.html) when printing. The content between the curly braces is evaluated when producing the output." @@ -110,6 +119,7 @@ }, { "cell_type": "markdown", + "id": "fd16d9a3", "metadata": {}, "source": [ "### Number of training examples `m`\n", @@ -119,6 +129,7 @@ { "cell_type": "code", "execution_count": null, + "id": "34941f09", "metadata": {}, "outputs": [], "source": [ @@ -130,6 +141,7 @@ }, { "cell_type": "markdown", + "id": "d632fae4", "metadata": {}, "source": [ "One can also use the Python `len()` function as shown below." 
@@ -138,6 +150,7 @@ { "cell_type": "code", "execution_count": null, + "id": "c4607c4f", "metadata": {}, "outputs": [], "source": [ @@ -148,6 +161,7 @@ }, { "cell_type": "markdown", + "id": "e37fcde7", "metadata": {}, "source": [ "### Training example `x_i, y_i`\n", @@ -161,6 +175,7 @@ { "cell_type": "code", "execution_count": null, + "id": "e5399aac", "metadata": {}, "outputs": [], "source": [ @@ -173,6 +188,7 @@ }, { "cell_type": "markdown", + "id": "15c12a8c", "metadata": {}, "source": [ "### Plotting the data" @@ -180,6 +196,7 @@ }, { "cell_type": "markdown", + "id": "b1be049e", "metadata": {}, "source": [ "You can plot these two points using the `scatter()` function in the `matplotlib` library, as shown in the cell below. \n", @@ -191,6 +208,7 @@ { "cell_type": "code", "execution_count": null, + "id": "5fc69a55", "metadata": {}, "outputs": [], "source": [ @@ -207,6 +225,7 @@ }, { "cell_type": "markdown", + "id": "31e27c47", "metadata": {}, "source": [ "## Model function\n", @@ -225,6 +244,7 @@ { "cell_type": "code", "execution_count": null, + "id": "a24ebd1a", "metadata": {}, "outputs": [], "source": [ @@ -236,6 +256,7 @@ }, { "cell_type": "markdown", + "id": "be869ab6", "metadata": {}, "source": [ "Now, let's compute the value of $f_{w,b}(x^{(i)})$ for your two data points. You can explicitly write this out for each data point as - \n", @@ -252,6 +273,7 @@ { "cell_type": "code", "execution_count": null, + "id": "ed7f9ec2", "metadata": {}, "outputs": [], "source": [ @@ -274,6 +296,7 @@ }, { "cell_type": "markdown", + "id": "7526c8e7", "metadata": {}, "source": [ "Now let's call the `compute_model_output` function and plot the output.." @@ -282,6 +305,7 @@ { "cell_type": "code", "execution_count": null, + "id": "12f7764f", "metadata": {}, "outputs": [], "source": [ @@ -305,6 +329,7 @@ }, { "cell_type": "markdown", + "id": "e07f2f91", "metadata": {}, "source": [ "As you can see, setting $w = 100$ and $b = 100$ does *not* result in a line that fits our data. 
\n", @@ -318,6 +343,7 @@ }, { "cell_type": "markdown", + "id": "64af9d27", "metadata": {}, "source": [ "
\n", @@ -333,6 +359,7 @@ }, { "cell_type": "markdown", + "id": "8bdbe7dd", "metadata": {}, "source": [ "### Prediction\n", @@ -342,6 +369,7 @@ { "cell_type": "code", "execution_count": null, + "id": "1678db16", "metadata": {}, "outputs": [], "source": [ @@ -355,6 +383,7 @@ }, { "cell_type": "markdown", + "id": "c68e9b44", "metadata": {}, "source": [ "# Congratulations!\n", @@ -368,6 +397,7 @@ { "cell_type": "code", "execution_count": null, + "id": "764f68d9", "metadata": {}, "outputs": [], "source": [] @@ -375,7 +405,7 @@ ], "metadata": { "kernelspec": { - "display_name": "Python 3", + "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, @@ -389,7 +419,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.6" + "version": "3.8.16" }, "toc-autonumbering": false },