PathToPerformance

I'm trying to get a GSoC 2024 in LLVM

and I will be documenting my work with this ongoing blogpost in reverse chronological order.


If you want to see more posts like this, consider chucking a buck or two on my GitHub sponsors, or, you know, hire me as a grad student.


23/02/2024

Only added stubs for where ISD::[US]CMP will go in the LegalizerDAG.cpp. Some code is better than no code!

22/02/2024

Back from exams. I let my mentors know I wouldn't be around for a few and would come back later.

Omg Nori taught me how to use vim marks (even within VSCode). ma will set a mark a and 'a lets you jump back to it. If you set an uppercase one, it's GLOBAL and you can jump between files!

19/02/2024

Got feedback for the proposal. Seems people like not reading ChatGPT scripts, whodathunkit. Aggregated suggestions and sent it off. Should update if I get more updates.

Also, rebased the next step of the PR, which is merging into the SelectionDAG

Alright, pushed some code that probably fails on everything but can't get more brainpower on learning the casting RTTI APIs today.

18/02/2024

IT GOT MERGED!

I wrote up the draft proposal, and sent it for feedback from the mentors.

Tomorrow I'll try to get started on the next steps of the proposal and send a dummy PR to get to the good stuff.

17/02/2024

Actually just rested yesterday. 'Twas good. Today I'll do some light reading.

enum { Foo, Bar, Baz } X = foo();

switch (X) {
  case Foo: /* Handle Foo */; break;
  case Bar: /* Handle Bar */; break;
  default:
    llvm_unreachable("X should be Foo or Bar here");
}
LLVM_DEBUG(dbgs() << "I am here!\n");

and then

$ opt < a.bc > /dev/null -mypass
<no output>
$ opt < a.bc > /dev/null -mypass -debug
I am here!

16/02/2024

Start: read LLVM Programmer's Manual, then start with some local changes to the SelectionDAG.

15/02/2024

Derp, I kept getting "undefined intrinsics" errors when running FileCheck like

> ../../../build/bin/opt -S -passes=verify 2>&1 intrinsic-cmp.ll | FileCheck intrinsic-cmp.ll

... because I had to rebuild LLVM and have the new opt pick it up. Once that happened, even the new llvm-lit passed :D

I also had to remove the CHECK-LABEL, still not sure why. Lemme read that for a bit.

Also, I should update mold - it failed with an unknown --long-plt option.

Here's my LLVM build instruction:

> cmake -S . -B build -G Ninja -DLLVM_TARGETS_TO_BUILD=X86 -DBUILD_SHARED_LIBS=ON -DCMAKE_BUILD_TYPE=Debug -DLLVM_OPTIMIZED_TABLEGEN=ON -DLLVM_USE_NEWPM=ON -DLLVM_USE_SPLIT_DWARF=ON -DLLVM_PARALLEL_LINKER_JOBS=10 -DLLVM_PARALLEL_COMPILE_JOBS=10
> mold -run ninja -C .

Windows builders take freakin' forever to run.

// Typically there's no reason to copy.
for (const auto &Val : Container) observe(Val);
for (auto &Val : Container) Val.change();

// Remove the reference if you really want a new copy.
for (auto Val : Container) { Val.change(); saveSomewhere(Val); }

// Copy pointers, but make it clear that they're pointers.
for (const auto *Ptr : Container) observe(*Ptr);
for (auto *Ptr : Container) Ptr->change();

* use early returns, specially in loops:

// not this 
for (Instruction &I : BB) {
  if (auto *BO = dyn_cast<BinaryOperator>(&I)) {
    Value *LHS = BO->getOperand(0);
    Value *RHS = BO->getOperand(1);
    if (LHS != RHS) {
      ...
    }
  }
}
// prefer this 
for (Instruction &I : BB) {
  auto *BO = dyn_cast<BinaryOperator>(&I);
  if (!BO) continue;

  Value *LHS = BO->getOperand(0);
  Value *RHS = BO->getOperand(1);
  if (LHS == RHS) continue;

  ...
}

* other good tips for no *else* after a *return* * turn predicate loops into functions * assert liberally! you may find other people's bugs! assert(Ty->isPointerType() && "Can't allocate a non-pointer type!"); * consider llvm::unreachable("invalid radix for integer literal") * never using namespace std * don't evaluate end() every time through a loop!

// not this
BasicBlock *BB = ...
for (auto I = BB->begin(); I != BB->end(); ++I)
  ... use I ...

// prefer this 
BasicBlock *BB = ...
for (auto I = BB->begin(), E = BB->end(); I != E; ++I)
  ... use I ...

* lol #include <iostream> is forbidden, as well as std::endl * prefer preincrement (++x) * anonymous namespaces allow more aggressive optimizations! make them small and only use them for class declarations * omit braces on short if's -_- * read "Effective C++" by Scott Meyers or "Large Scale C++ Software Design" by John Lakos. later.

BIG ALERT! We're passing all tests and Nikita says "LGTM!" Huzzahhh!

I finished reading the Coding Standards and will now read the LLVM Programmer's Manual.

Afternoon: only 1 reviewer missing approval!

14/02/2024

It's close to finishingggggggggg

Nikita mentioned the CodingStandars, I'll give that a read in a bit.

In FileCheck unit tests, it's just 2 spaces, not tabs or 4 spaces.

Dhruv recommends using git clang-format or Format Modified Lines in VSCode.

I only have to make FileCheck pass now and Nikita says it should be good!

13/02/2024

OK, I just saw that both Nikita Popov and Dhruv Chawla are listed as mentors for the project. Should try and reach out before the application period begins.

Nikita commented that there's two good places to look into (which I should do meanwhile):

Notes on the SelectionDAG

After a few hours of noodling with PR reviews, it's clear I gotta jump on a next one/help PR review some other people. Maybe it's not great to go full mercenary on every good-first-issue and try and help others along the way.

ParticleSwarmOptimization was kind enough in the LLVM/beginners Discord chat to point out that I should have a full build of LLVM and then run llvm-lit from there on a unit test. Sure enough, that let me do gh pr checkout 84903 and then LIT llvm/test/CodeGen/AArch64/hadd-combine.ll to check that a unit test properly ran. Now I can help verify my own and other people's PRs! yay!

That seems like enough for today, I'll hunt for some good issues tomorrow.

12/02/2024

Did a bunch of very quick fixes and got some neat feedback.

11/02/2024

Ohhhhhh neat! Got some really fast feedback this time around on stray newlines and a couple fixes on the PR. Things are moving faster and faster! I think I'm only missing adding some FileCheck tests for the invariants under the Verifier.cpp stuff that isn't handled by Intrinsics.td but that sounds about it!

FileCheck reading I'll do for a bit.

Note: If you are going to check multiple things in a single file, use a CHECK-LABEL to avoid spurious matches.

Sometimes, a CHECK-NEXT is useful as the last unit test.

Learned the LLVMIR verbose call syntax goes (gotta include types in call site).

That should do it for today, see y'all tomorrow.

10/02/2024

Plan for today: Read the IR Verifier, add code to it if needed, then work on adding to SDAG node.

IR Verifier code: It's under llvm/IR/Verifier.h and llvm/IR/Verifier.cpp.

Ok, read the code for a bit. What I'm interested in in Verifier.cpp is the huge switch statement towards the end of the file where the verification of intrinsics happens.

Oh neat, learned a lot of places that probably need modification via searching case Intrinsic:: in the llvm folder.

Neat! VectorUtils.cpp looks like where we add SIMD :D

There is also a SelectionDAGBuilder.cpp that seems like I should update stuff there.

Ah, seems the SDAG needs the ID node to be added to the ISDOpcodes first, like ISD::SCMP and such.

Intrinsic Function: start with llvm. prefix. Must always be external functions. If any are added, must be documented in LangRef. Have naming convention on type name return.

In TargetLowering.h:

/// This class defines information used to lower LLVM code to legal SelectionDAG
/// operators that the target instruction selector can accept natively.
///
/// This class also defines callbacks that targets must implement to lower
/// target-specific constructs to SelectionDAG operators.
class TargetLowering : public TargetLoweringBase {

Updated the PR to not include a dirty file from the ValueTracking.cpp unit tests.

Sweet, rebased the spaceship-intrinsic branch changes into a new one.

Now I can go for

lib/CodeGen/SelectionDAG/SelectionDAG.cpp:

Add code to print the node to getOperationName. If your new node can be
evaluated at compile time when given constant arguments (such as an add of a constant with another constant), find the getNode method that takes the appropriate number of arguments, and add a case for your node to the switch statement that performs constant folding for nodes that take the same number of arguments as your new node.

Which gives on Line 6629 of SelectionDAG.cpp

SDValue SelectionDAG::getNode(unsigned Opcode, const SDLoc &DL, EVT VT,
                              SDValue N1, SDValue N2, const SDNodeFlags Flags) {

where I can add an case ISD::UCMP/case ISD::SCMP.

Oh wow, having VSCode let me just hover over a type and give me info is amazing.

Alright, did some good work today I think.


Got some feedback on typo fixes from other people and some concrete directions from Nikita:

def int_scmp : DefaultAttrsIntrinsic<
    [llvm_anyint_ty], [llvm_anyint_ty, LLVMMatchType<1>],
    [IntrNoMem, IntrSpeculatable, IntrWillReturn]>;

for the Intrinsics.td so that we can have a "result type overload over a fixed type" and then

and then add a check in Verifier.cpp that a) the return type and the first argument type have the same number of elements (if they are vectors) and b) the return scalar type has width at least 2.

So I've pushed a commit already to have the Intrinsics.td that way and I will now add a case to the Verifier.cpp in the void Verifier::visitIntrinsicCall(Intrinsic::ID ID, CallBase &Call) { to add

(Many hours later) Put in a good first effort into the Verifier.cpp cases for Intrinsic::scmp/ucmp. It's probably a bit boneheaded but I've been getting really good and fast feedback on the PRs, so I feel encouraged to keep the hot streak going.

09/02/2024

Lost the renaming battle to standard practices. Can't complain. Updated the PR to reflect that.

Now, reading the DCE pass code.

Ok, Andrzej Warzyński did an incredibly useful tutorial for llvm-tutor.

I'll try writing the Transformation pass, since it's closes to what I need.

Notes: * LLVM_DEBUG is super useful.

#include "llvm/ADT/Statistic.h"
#include "llvm/Support/Debug.h"

#define DEBUG_TYPE "mba-add"
STATISIC(SubstCount, "The # of substituted instructions"

and then you can do:

LLVM_DEBUG(dbgs() << *BinOp << " -> " << *NewInst << "\n";
    // or , with Statistic and `-stat` on the `opt` CLI and debug build
    ++SubstCount;
for (auto &Func : M) {
    for (auto &BB : Func) {
        for (auto &Ins : BB) {
            ...
lldb -- $LLVM_DIR/bin/opt -load-pass-plugin lib/libMyPass.so -passes=my-pass -S dummy.ll
(lldb) b MyPass::run 
(lldb) r

Finally: started a branch called scmp-and-ucmp and I'll start trying out changes there whilst people make up their minds.

8/02/2024

Jyn strikes again and sometimes you can just do make after cmake because some stuff is optional.

Otherwise, install:

sudo apt install libzstd-dev libcurl4-openssl-dev libedit-dev

to get rolling.

Once you do

# Run the pass
$LLVM_DIR/bin/opt -load-pass-plugin ./libHelloWorld.{so|dylib} -passes=hello-world -disable-output input_for_hello.ll
# Expected output
(llvm-tutor) Hello from: foo
(llvm-tutor)   number of arguments: 1
(llvm-tutor) Hello from: bar
(llvm-tutor)   number of arguments: 2
(llvm-tutor) Hello from: fez
(llvm-tutor)   number of arguments: 3
(llvm-tutor) Hello from: main
(llvm-tutor)   number of arguments: 2

using -disable-output means no bitcode gets produced.

Passes come in 3 flavors, mostly: Analysis, Transformations and CFG manipulations.

Note that clang adds the optnone function attribute if 1) no opt level is specified or -O0 is specified.

Ah, forgot to run the cmake .. -> mold -run make -j and had some passes missing. Derp.

Minutes lost to cmake bullshit: 60.

Oh sweet! I just learned that I can write an injection pass that will give me a new binary and it will print out cool analysis info.

Also, if I get an instrumented binary I can just use lli to interpret the .ll file directly.

You can also build a static binary that will run that analysis for you!

Ran a bunch of passes with opt and friends. Transformation passes will normally inherit from PassInfoMixin.

Analysis Passes will inherit from AnalysisInfoMixin.

Ok, cool I an outdated example in llvm-tutor in the examples and sent a PR for it.

Tomorrow I shall dive into those optimization passes at the end and hopefully run some good lit tests.

7/02/2024

Alright, did some good catchup on the semester and pushed the PR forward a bit.

I also had an awesome "uniwtting tourist" genius moment by pointing out the return type could be different than what Nikita had thought of.

Today I setup my environment with

wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
sudo apt-add-repository "deb http://apt.llvm.org/jammy/ llvm-toolchain-jammy-17 main"
sudo apt-get update
sudo apt-get install -y llvm-17 llvm-17-dev llvm-17-tools clang-17

so now I can run opt shenanigans without awkward path invocations.

2/02/2024

I'm waiting for others to chime in on my latest fix so I will be reading on how to add unit tests.

Cool, that took like 25 minutes and I pushed a commit into the unit test generation framework.

Took me a bit but tried adding using llvm-lit and couldn't get the Examples folder to run, so I posted a question in the #beginners channel about it.

1/02/2024

Sweet! I was able to address Nikita's refactoring comments without too much hassle.

I guess the next task is to add it to the Verifier.

-> I read the llvm/lib/IR/Verifier.cpp header and grep'd for smin. Seems I can add a

Intrinsics::vector_reduce_sthreecmp:
Intrinsics::vector_reduce_uthreecmp:

on line 5378 and get away with this. I don't see other places where it's defined.

I think I've hit my potential here. I will go do some tutorials.

29/02/2024

"hazlo cobarde"

Add the 3 way comparison instruction <=> to LLVM.

I like this GSoC in particular because

Task 1: Add to LangRef

Add a new intrinsic - Langref, then Intrinsics.td, then maybe the pass verifier.

I've already put up a sample PR and got redirected on what looks like the proper working path for this endeavour.

Oh neat, I've finished. Only took about 1 hour with careful copy pasting. I'm probably blundering the return type being iM bits, but someone will correct me.

Now, I need to add an entry of the intrinsic into TableGen.

Task 2: Add to Intrinsics.td

Whelp, I guess I gotta learn tablegen and then this Intrinsics.td file.

Ok, I found a non-intimidating TableGen overview.

Gotta find the optimization wizardry in there after reading for a few minutes.

Alright, took more than a bit of an hour but I got pushed something for this task.

Pinged Nikita to get some adults to look at my horrible TableGen incantation.

See ya tomorrow.