My GitHub Projects 🚀

Clustering Algorithm Analysis

Description: Efficient pipelines are built for approximate nearest-neighbour search and k-means clustering on MNIST. Parts 1 and 2 invest in heavy index construction to accelerate queries directly in the raw \(28 \times 28\) pixel space (784 dimensions), using classic hashing and graph-based indices. Part 3 instead first compresses the images to a latent representation of fewer than 50 dimensions. Timing, approximation quality, and Silhouette scores are then compared across all three parts.

Tech Stack: C/C++, Python, Jupyter Notebook, TensorFlow/Keras
Key Features:
- Vector Search and Clustering (LSH, Hypercube)
- Graph Nearest Neighbor Search (GNNS, MRNG, NSG)
- Dimensionality reduction via NN autoencoders
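As a rough illustration of the hashing idea behind the LSH index, the sketch below hashes vectors with the common Euclidean scheme \(h(x) = \lfloor (a \cdot x + b) / w \rfloor\) and searches only the buckets a query falls into. It is a minimal Python sketch, not the project's C/C++ code: the names `make_hash` and `LSHIndex`, and the parameters `num_tables`, `k`, and `w`, are assumptions chosen for the example.

```python
import math
import random
from collections import defaultdict


def make_hash(dim, k, w, rng):
    """Build one amplified hash g(x) = (h_1(x), ..., h_k(x)),
    with h_i(x) = floor((a_i . x + b_i) / w) and Gaussian a_i."""
    rows = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]
    offsets = [rng.uniform(0.0, w) for _ in range(k)]

    def g(x):
        return tuple(
            math.floor((sum(a * v for a, v in zip(row, x)) + b) / w)
            for row, b in zip(rows, offsets)
        )

    return g


class LSHIndex:
    """Hypothetical in-memory index: one hash table per hash function g."""

    def __init__(self, dim, num_tables=4, k=3, w=4.0, seed=0):
        rng = random.Random(seed)
        self.hashes = [make_hash(dim, k, w, rng) for _ in range(num_tables)]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.points = []

    def add(self, x):
        """Store x and register its index in every table."""
        idx = len(self.points)
        self.points.append(x)
        for g, table in zip(self.hashes, self.tables):
            table[g(x)].append(idx)
        return idx

    def query(self, q):
        """Return (index, distance) of the closest point sharing a bucket
        with q, or None when every bucket is empty."""
        candidates = set()
        for g, table in zip(self.hashes, self.tables):
            candidates.update(table.get(g(q), ()))
        if not candidates:
            return None
        best = min(candidates, key=lambda i: math.dist(self.points[i], q))
        return best, math.dist(self.points[best], q)
```

Only the points that collide with the query are compared exactly, which is where the speed-up over a brute-force scan comes from. The price is that a true neighbour can be missed when it lands in a different bucket in every table.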

Implementing a Shell

Description: Implementation of mysh, a lightweight Unix shell modelled on bash.

Tech Stack: C/C++
Key Features:
- I/O redirection
- Pipelines
- Background execution
- Wildcard expansion
- Alias management
- Signal handling
- Command history

Client-Server Model through TCP

Description: Implementation of a thread-pooled TCP poller server with a stress-testing client, ensuring safe concurrency through POSIX mutexes and condition variables.

Tech Stack: C/C++, Bash
Key Features:
- poller - Multithreaded C/C++ server that queues incoming sockets for its worker threads
- pollSwayer - Multithreaded client that reads an input file and spawns one thread per voter
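The mutex and condition-variable pattern mentioned in the description can be sketched as a bounded queue shared by one producer and a pool of workers. This is an illustrative Python analogue of the POSIX primitives, not the project's code: `BoundedQueue`, `serve`, and the integer "connections" standing in for socket descriptors are all assumptions made for the example.

```python
import threading
from collections import deque


class BoundedQueue:
    """Fixed-capacity queue guarded by one lock and two condition variables,
    mirroring a pthread mutex with 'not empty' and 'not full' conditions."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.lock = threading.Lock()
        self.not_empty = threading.Condition(self.lock)
        self.not_full = threading.Condition(self.lock)

    def put(self, item):
        with self.not_full:
            # Re-check the predicate in a loop to survive spurious wake-ups.
            while len(self.items) >= self.capacity:
                self.not_full.wait()
            self.items.append(item)
            self.not_empty.notify()

    def get(self):
        with self.not_empty:
            while not self.items:
                self.not_empty.wait()
            item = self.items.popleft()
            self.not_full.notify()
            return item


def serve(connections, num_workers=4, capacity=2):
    """Feed stand-in connections to a worker pool and return what was handled."""
    queue = BoundedQueue(capacity)
    handled = []
    handled_lock = threading.Lock()
    stop = object()  # sentinel telling a worker to exit

    def worker():
        while True:
            conn = queue.get()
            if conn is stop:
                return
            with handled_lock:
                handled.append(conn)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in workers:
        t.start()
    for conn in connections:
        queue.put(conn)  # blocks while the queue is full
    for _ in workers:
        queue.put(stop)  # one sentinel per worker
    for t in workers:
        t.join()
    return handled
```

Calling `serve(range(20))` hands every stand-in connection to exactly one worker. The order of the returned list depends on thread scheduling, so only its contents are predictable.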

Data Mining Techniques: Customer Profiling & Goodreads Book Analysis

Description: Customers are segmented with Agglomerative and K-Means clustering for profile analysis, while cosine similarity on vectorized book descriptions supports a recommendation system.

Tech Stack: Python, Jupyter Notebook
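A minimal sketch of how cosine similarity over vectorized descriptions can drive recommendations is shown below. It assumes plain term-count vectors, which may differ from the vectorization actually used in the notebook (TF-IDF, for instance); the helpers `vectorize`, `cosine`, and `recommend` are hypothetical names for the example.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a description into a sparse term-count vector (an assumption;
    any other vectorizer could be swapped in)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(title, descriptions, top_n=2):
    """Rank every other book by similarity to `title`, most similar first."""
    vectors = {t: vectorize(d) for t, d in descriptions.items()}
    query = vectors[title]
    scores = [
        (cosine(query, vec), other)
        for other, vec in vectors.items()
        if other != title
    ]
    # Sort by descending score, breaking ties alphabetically by title.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [other for _, other in scores[:top_n]]
```

Because cosine similarity depends only on the angle between vectors, a long and a short description with the same word proportions score as equally similar.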

Thanasis Trispiotis
