Make OpenCL on Intel compile without opening device, working for A770 and Max 1100 #9462
base: master
…ed on target device
I'll leave this open, but it's clear to me this PR wasn't carefully read 3 times
What exactly do you mean? I am trying to contribute for the first time, so it is a bit tricky to understand all the requirements and the context of the task. All I know is that the compilation with ocloc works. My problem is answering the open question. I would appreciate it if you could give some requirements and hints on that bounty. As you did not close the PR, it should not be totally the wrong way.
Bounty locked, but this code needs to sparkle before we can merge it. It's the right approach, yet things aren't in the right places, there's whitespace changes, there's a random , etc....
@amarmemic, some silly questions (from a fellow A770 owner):
tinygrad/tinygrad/runtime/ops_gpu.py Line 112 in 5662944
I did not run any benchmarks, but I think there should not be a difference in performance because both approaches use the same igc backend (see second link above). No, you don't need to set anything, just use the IntelOfflineCompiler class. But anyway, this PR will not be merged, according to George from the last meeting.
Ok, thanks... I thought I had it configured, however I hadn't noticed any performance difference, so I was wondering about that. I fell down a bit of a rabbit hole recently trying to figure out why inference perf seems slow using tinygrad via exo.
WIP:
Decided to integrate the Intel OpenCL offline compiler
which uses as backend (maybe use it directly?)
but there is a multi-vendor alternative with clang
Contribution:
created Python bindings for libocloc.so
implemented an IntelOfflineCompiler class that uses ocloc to compile without opening a device via the OpenCL API
install libocloc.so + header API on the target devices that run the unit tests, so that they succeed (fixed package version)
fix: Error! Loading of FCL library has failed! Filename: libigdfcl.so.2
Error! FCL initialization failure. Error code = -6
Build failed with error code: -6
add unit tests
cleanup code + naming + correct file location
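For readers unfamiliar with the approach, here is a rough sketch of what compiling through libocloc.so without opening a device might look like with ctypes. It is not the code from this PR. The helper names (`load_shared_lib`, `ocloc_compile_args`, `to_c_argv`, `compile_offline`) are hypothetical, the `oclocInvoke` / `oclocFreeOutput` argument layout follows my reading of `ocloc_api.h` and should be checked against the installed header, and the device string is whatever your ocloc version accepts.

```python
import ctypes
from typing import Dict, List, Optional


def load_shared_lib(name: str) -> Optional[ctypes.CDLL]:
    """Return a handle to the shared library, or None if it cannot be loaded."""
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None


def ocloc_compile_args(source_name: str, device: str) -> List[bytes]:
    # Mirrors the CLI form `ocloc compile -file <name> -device <device>`.
    return [b"ocloc", b"compile", b"-file", source_name.encode(),
            b"-device", device.encode()]


def to_c_argv(args: List[bytes]):
    # Pack the byte strings into a C `const char *argv[]` array.
    return (ctypes.c_char_p * len(args))(*args)


def compile_offline(source: bytes, device: str,
                    source_name: str = "kernel.cl") -> Dict[str, bytes]:
    """Hypothetical wrapper: compile OpenCL C in memory, no device opened."""
    lib = load_shared_lib("libocloc.so")
    if lib is None:
        raise RuntimeError("libocloc.so could not be loaded")

    args = ocloc_compile_args(source_name, device)
    argv = to_c_argv(args)

    # One in-memory source file, NUL-terminated (assumed layout, see header).
    src_buf = ctypes.create_string_buffer(source)
    u8_p = ctypes.POINTER(ctypes.c_uint8)
    data_sources = (u8_p * 1)(ctypes.cast(src_buf, u8_p))
    len_sources = (ctypes.c_uint64 * 1)(len(source) + 1)
    name_sources = (ctypes.c_char_p * 1)(source_name.encode())

    # Output arrays are allocated by ocloc and released by oclocFreeOutput.
    num_out = ctypes.c_uint32(0)
    data_out = ctypes.POINTER(u8_p)()
    len_out = ctypes.POINTER(ctypes.c_uint64)()
    name_out = ctypes.POINTER(ctypes.c_char_p)()

    status = lib.oclocInvoke(
        ctypes.c_uint(len(args)), argv,
        ctypes.c_uint32(1), data_sources, len_sources, name_sources,
        ctypes.c_uint32(0), None, None, None,  # no extra input headers
        ctypes.byref(num_out), ctypes.byref(data_out),
        ctypes.byref(len_out), ctypes.byref(name_out))

    # Copy the outputs into Python bytes before freeing them.
    outputs = {name_out[i].decode(): ctypes.string_at(data_out[i], len_out[i])
               for i in range(num_out.value)}
    lib.oclocFreeOutput(ctypes.byref(num_out), ctypes.byref(data_out),
                        ctypes.byref(len_out), ctypes.byref(name_out))
    if status != 0:
        raise RuntimeError("ocloc failed with error code %d" % status)
    return outputs
```

If `load_shared_lib("libigdfcl.so.2")` returns `None` on the target machine, that would be consistent with the FCL loading error quoted above, so it can serve as a quick check before running the unit tests.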
Open questions: