test-repo/IML Projects/Task 1b /task1b_ql4jfi6af0/template_solution.ipynb
2026-03-22 15:58:12 +01:00


{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### General guidance\n",
"\n",
"This template guides you through the implementation of this task. Read the whole template first\n",
"to get a sense of the overall structure of the code before filling in any of the TODO gaps.\n",
"This is the Jupyter notebook version of the template. For the Python file version, refer to `template_solution.py`.\n",
"\n",
"First, we import necessary libraries:"
]
},
{
"cell_type": "code",
"metadata": {
"ExecuteTime": {
"end_time": "2026-03-15T18:22:13.169924Z",
"start_time": "2026-03-15T18:22:13.165934Z"
}
},
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Add any additional imports here (however, the task is solvable without using \n",
"# any additional imports)\n",
"# import ..."
],
"outputs": [],
"execution_count": 30
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Loading data"
]
},
{
"cell_type": "code",
"metadata": {
"ExecuteTime": {
"end_time": "2026-03-15T18:22:13.190312Z",
"start_time": "2026-03-15T18:22:13.181357Z"
}
},
"source": [
"data = pd.read_csv(\"train.csv\")\n",
"y = data[\"y\"].to_numpy()\n",
"data = data.drop(columns=[\"Id\", \"y\"])\n",
"# print a few data samples\n",
"print(data.head())\n",
"X = data.to_numpy()"
],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" x1 x2 x3 x4 x5\n",
"0 0.02 0.05 -0.09 -0.43 -0.08\n",
"1 -0.13 0.11 -0.08 -0.29 -0.03\n",
"2 0.08 0.06 -0.07 -0.41 -0.03\n",
"3 0.02 -0.12 0.01 -0.43 -0.02\n",
"4 -0.14 -0.12 -0.08 -0.02 -0.08\n"
]
}
],
"execution_count": 31
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Transform features"
]
},
{
"cell_type": "code",
"metadata": {
"ExecuteTime": {
"end_time": "2026-03-15T18:22:13.201976Z",
"start_time": "2026-03-15T18:22:13.198370Z"
}
},
"source": [
"\"\"\"\n",
"Transform the 5 input features of matrix X (x_i denoting the i-th component of a given row in X) \n",
"into 21 new features phi(X) in the following manner:\n",
"5 linear features: phi_1(X) = x_1, phi_2(X) = x_2, phi_3(X) = x_3, phi_4(X) = x_4, phi_5(X) = x_5\n",
"5 quadratic features: phi_6(X) = x_1^2, phi_7(X) = x_2^2, phi_8(X) = x_3^2, phi_9(X) = x_4^2, phi_10(X) = x_5^2\n",
"5 exponential features: phi_11(X) = exp(x_1), phi_12(X) = exp(x_2), phi_13(X) = exp(x_3), phi_14(X) = exp(x_4), phi_15(X) = exp(x_5)\n",
"5 cosine features: phi_16(X) = cos(x_1), phi_17(X) = cos(x_2), phi_18(X) = cos(x_3), phi_19(X) = cos(x_4), phi_20(X) = cos(x_5)\n",
"1 constant feature: phi_21(X)=1\n",
"\n",
"Parameters\n",
"----------\n",
"X: matrix of floats, dim = (700,5), inputs with 5 features\n",
"\n",
"Compute\n",
"----------\n",
"X_transformed: matrix of floats: dim = (700,21), transformed input with 21 features\n",
"\"\"\"\n",
"X_transformed = np.zeros((700, 21))\n",
"quadratic_features = np.power(X,2)\n",
"exponential_features = np.exp(X)\n",
"cosine_features = np.cos(X)\n",
"constant_feature = np.ones((X.shape[0],1))\n",
"X_transformed = np.concatenate((X, quadratic_features, exponential_features, cosine_features, constant_feature), axis=1)\n",
"assert X_transformed.shape == (700, 21)"
],
"outputs": [],
"execution_count": 32
},
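{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check of the feature layout described above, the transformation can be applied to a single hand-picked row and each block of columns compared against its definition. This is only a sketch: `transform_features` and `X_demo` are hypothetical names introduced for illustration and are not part of the template."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"\n",
"def transform_features(X):\n",
"    # Hypothetical helper: stack linear, quadratic, exponential, cosine and constant features\n",
"    return np.concatenate((X, X**2, np.exp(X), np.cos(X), np.ones((X.shape[0], 1))), axis=1)\n",
"\n",
"# One illustrative row with 5 features (made-up values, not taken from train.csv)\n",
"X_demo = np.array([[0.0, 1.0, 2.0, 3.0, 4.0]])\n",
"phi_demo = transform_features(X_demo)\n",
"\n",
"assert phi_demo.shape == (1, 21)\n",
"assert np.allclose(phi_demo[0, 0:5], X_demo[0])             # phi_1 .. phi_5: linear\n",
"assert np.allclose(phi_demo[0, 5:10], X_demo[0]**2)         # phi_6 .. phi_10: quadratic\n",
"assert np.allclose(phi_demo[0, 10:15], np.exp(X_demo[0]))   # phi_11 .. phi_15: exponential\n",
"assert np.allclose(phi_demo[0, 15:20], np.cos(X_demo[0]))   # phi_16 .. phi_20: cosine\n",
"assert phi_demo[0, 20] == 1.0                               # phi_21: constant"
],
"outputs": [],
"execution_count": null
},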
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Fit data"
]
},
{
"cell_type": "code",
"metadata": {
"ExecuteTime": {
"end_time": "2026-03-15T18:22:13.625674Z",
"start_time": "2026-03-15T18:22:13.217438Z"
}
},
"source": [
"\"\"\"\n",
"Use the transformed data points X_transformed and fit the logistic regression on this \n",
"transformed data. Finally, compute the weights of the fitted logistic regression. \n",
"\n",
"Parameters\n",
"----------\n",
"X_transformed: array of floats: dim = (700,21), transformed input with 21 features\n",
"y: array of integers in {0,1}, dim = (700,), input labels\n",
"\n",
"Compute\n",
"----------\n",
"w: array of floats: dim = (21,), optimal parameters of logistic regression\n",
"\"\"\"\n",
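"weights = np.zeros((21,))\n",
"# Step size set from the largest singular value of X_transformed: 2n / sigma_max^2\n",
"learning_rate = 2 * X.shape[0] / np.linalg.svd(X_transformed, compute_uv=False)[0]**2\n",
"tolerance = 0.001\n",
"# Safety cap (assumed value): the norm of a mini-batch update may never drop below the tolerance\n",
"max_iterations = 100000\n",
"sigma = lambda x : 1/(1+np.exp(-x))\n",
"# Gradient of the mean logistic loss on the rows of X passed in\n",
"gradient = lambda w, X, y: X.T @ (sigma(X @ w) - y) / X.shape[0]\n",
"for _ in range(max_iterations):\n",
"    # Select a random mini-batch of 100 rows without replacement (mini-batch SGD)\n",
"    selection = np.random.choice(X_transformed.shape[0], 100, replace=False)\n",
"    X_random = X_transformed[selection,:]\n",
"    update = learning_rate * gradient(weights, X_random, y[selection])\n",
"    weights -= update\n",
"    # Stop once the most recent step is smaller than the tolerance\n",
"    if np.linalg.norm(update) <= tolerance:\n",
"        break\n",
"\n",
"assert weights.shape == (21,)"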
],
"outputs": [],
"execution_count": 33
},
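{
"cell_type": "markdown",
"metadata": {},
"source": [
"The update above relies on the gradient of the mean logistic loss, `X.T @ (sigma(X @ w) - y) / n`. The following sketch compares that formula against a central finite-difference estimate on a small synthetic dataset. The names `X_demo`, `y_demo`, `w_demo`, `mean_log_loss` and `grad`, as well as the synthetic data, are assumptions made for illustration only."
]
},
{
"cell_type": "code",
"metadata": {},
"source": [
"import numpy as np\n",
"\n",
"# Small synthetic problem (illustrative only, unrelated to train.csv)\n",
"rng = np.random.default_rng(0)\n",
"X_demo = rng.normal(size=(50, 3))\n",
"y_demo = rng.integers(0, 2, size=50)\n",
"w_demo = rng.normal(size=3)\n",
"\n",
"def mean_log_loss(w, X, y):\n",
"    # Mean negative log-likelihood: log(1 + exp(z)) - y * z with z = X @ w\n",
"    z = X @ w\n",
"    return np.mean(np.logaddexp(0, z) - y * z)\n",
"\n",
"def grad(w, X, y):\n",
"    # Analytic gradient, same formula as in the fitting cell\n",
"    return X.T @ (1 / (1 + np.exp(-(X @ w))) - y) / X.shape[0]\n",
"\n",
"# Central finite differences along each coordinate direction\n",
"eps = 1e-6\n",
"numeric = np.array([\n",
"    (mean_log_loss(w_demo + eps * e, X_demo, y_demo)\n",
"     - mean_log_loss(w_demo - eps * e, X_demo, y_demo)) / (2 * eps)\n",
"    for e in np.eye(3)\n",
"])\n",
"\n",
"assert np.allclose(numeric, grad(w_demo, X_demo, y_demo), atol=1e-6)"
],
"outputs": [],
"execution_count": null
},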
{
"metadata": {
"ExecuteTime": {
"end_time": "2026-03-15T18:22:13.630314Z",
"start_time": "2026-03-15T18:22:13.629075Z"
}
},
"cell_type": "code",
"source": "",
"outputs": [],
"execution_count": null
},
{
"cell_type": "code",
"metadata": {
"ExecuteTime": {
"end_time": "2026-03-15T18:22:13.636358Z",
"start_time": "2026-03-15T18:22:13.633885Z"
}
},
"source": [
"# Save results in the required format\n",
"np.savetxt(\"./results.csv\", weights, fmt=\"%.12f\")"
],
"outputs": [],
"execution_count": 34
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2026-03-15T18:22:13.643898Z",
"start_time": "2026-03-15T18:22:13.641041Z"
}
},
"cell_type": "code",
"source": [
"matrix = np.array([[1,2],[3,4],[5,6],[7,8],[9,10]])\n",
"np.linalg.svd(X_transformed,compute_uv=False)\n",
"matrix[np.random.choice(matrix.shape[0], 1, replace=False), :]"
],
"outputs": [
{
"data": {
"text/plain": [
"array([[1, 2]])"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"execution_count": 35
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2026-03-15T18:22:13.707327Z",
"start_time": "2026-03-15T18:22:13.706165Z"
}
},
"cell_type": "code",
"source": "",
"outputs": [],
"execution_count": null
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}