DrejcPesjak

Drejc Pesjak DrejcPesjak

🇸🇮 Bitbucket repository: https://bitbucket.org/dpesjak/

9 followers · 31 following

https://drejcpesjak.github.io/website/

DrejcPesjak/README.md

Today's AI News

AI Reddit Recap:

1. Latent Space Reasoning in LLMs:

Large Language Models (LLMs) can now perform reasoning in latent space, separating internal reasoning from context tokens.
This allows for better performance with smaller models, potentially revolutionizing AI.
Concerns exist about safety, transparency, and the alignability of such models.

2. AMD's AI Hardware Push:

AMD is working on a new Radeon RX 9070 XT GPU with 32 GB of memory, boosting AI performance.
Users debate the potential of ROCm as an alternative to CUDA, calling for greater community involvement.

3. Project Digits: Powerful AI Workstation:

Nvidia's new Project Digits workstation offers improved features for researchers and students.
Some criticize the lack of memory bandwidth information, raising speculation about its performance capabilities.

4. Phi-4's Unconventional Creativity:

Phi-Lthy4, a pruned version of Phi-4, demonstrates remarkable abilities in creative writing and roleplay.
Discussion revolves around the model's size, potential for merging with others, and its unique writing style.

Other Highlights:

OpenAI's roadmap update reveals plans for GPT-4.5 and GPT-5, with concerns about tiered access and automation.
DeepSearch, a valuable AI feature, will soon be available to both Plus and Free users, sparking debate about its cost and accessibility.
An xAI employee resigns after criticizing the company's handling of internal information about its AI models.
OpenAI's new multimodal models can process both images and files, with rollout inconsistencies reported.

Pinned

DPhate-double-paraphrasing-hate-speech Public

Bachelor's thesis on removing hate from online comments using paraphrasing: algorithm DPhate

Python

scaling-monosemanticity-llama Public

Reproducing Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet using LLaMA. This project explores monosemantic neurons in large language models, implementing and extend…

Jupyter Notebook 2

Herz-bot Public

A Q-learning model for the card game Herz.

Java

unbalanced-media Public

Analysis of Unbalanced Slovenian Media News Outlets - Left vs. Right Wing

Python

weather-prediction-mlops Public

ML-in-the-cloud project for the university course Cloud Computing (RSO)

Jupyter Notebook

nyc-violation-tickets-analysis Public

Analysis and prediction of NYC violation tickets using big data and machine learning techniques.

Jupyter Notebook