Rebase onto llvmorg-17.0.5
widberg committed Nov 27, 2023
1 parent 98bfdac commit ebf3008
Showing 144 changed files with 2,486 additions and 168 deletions.
23 changes: 23 additions & 0 deletions .github/workflows/widberg-build.yml
@@ -0,0 +1,23 @@
name: Widberg Build

on: [push, pull_request]

jobs:
  build:
    name: Build
    runs-on: windows-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - uses: ilammy/[email protected]
      - name: Build
        run: |
          cmake -S llvm -B build -G Ninja -DCMAKE_BUILD_TYPE=RelWithDebInfo -DLLVM_ENABLE_PROJECTS="clang" -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_TARGETS_TO_BUILD="X86"
          cmake --build build --config RelWithDebInfo --target clang
      - name: Archive Artifacts
        uses: actions/upload-artifact@v3
        with:
          name: widberg-windows-x86_64-${{ github.sha }}
          path: ./build/bin/clang-cl.exe
46 changes: 46 additions & 0 deletions .github/workflows/widberg-release.yml
@@ -0,0 +1,46 @@
name: Widberg Release

on:
  push:
    tags:
      - 'widberg-*'

jobs:
  build:
    name: Publish
    runs-on: windows-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - uses: ilammy/[email protected]
      - name: Build
        run: |
          cmake -S llvm -B build -G Ninja -DCMAKE_BUILD_TYPE=RelWithDebInfo -DLLVM_ENABLE_PROJECTS="clang" -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_TARGETS_TO_BUILD="X86"
          cmake --build build --config RelWithDebInfo --target clang
      - name: Package
        run: |
          cd build/bin/
          7z a ../../bin.zip clang-cl.exe
      - name: Create Release
        id: create_release
        uses: actions/create-release@v1
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          tag_name: ${{ github.ref }}
          release_name: Release ${{ github.ref }}
          draft: false
          prerelease: false

      - name: Upload Archive to Release
        uses: actions/upload-release-asset@v1
        env:
          GITHUB_TOKEN: ${{ github.token }}
        with:
          upload_url: ${{ steps.create_release.outputs.upload_url }}
          asset_name: ${{ github.ref_name }}-windows-x86_64-${{ github.sha }}.zip
          asset_path: bin.zip
          asset_content_type: application/zip
141 changes: 141 additions & 0 deletions README.md
@@ -1,3 +1,144 @@
# The LLVM Compiler Infrastructure With Widberg Extensions

The LLVM Compiler Infrastructure With Widberg Extensions, affectionately called
the Widpiler, is a fork of LLVM (17.0.5) intended to implement C/C++ language
features in LLVM/Clang to aid in reverse engineering. Currently, the
scope of this project covers a subset of the IDA Pro [__usercall
syntax](wiki/User‐Defined-Calling-Conventions) and [shifted pointers](wiki/Shifted-Pointers). This is a research project and not production ready.

[![Build Status](https://github.com/widberg/llvm-project-widberg-extensions/actions/workflows/widberg-build.yml/badge.svg?branch=main)](https://github.com/widberg/llvm-project-widberg-extensions/actions/workflows/widberg-build.yml)
[![GitHub release (latest SemVer)](https://img.shields.io/github/v/release/widberg/llvm-project-widberg-extensions)](https://github.com/widberg/llvm-project-widberg-extensions/releases)
[![Release Nightly](https://img.shields.io/badge/release-nightly-5e025f?labelColor=301934)](https://nightly.link/widberg/llvm-project-widberg-extensions/workflows/widberg-build/main)

## Example

An example of the syntax that is currently supported is as follows:

```cpp
// __usercall
// https://www.hex-rays.com/products/ida/support/idadoc/1361.shtml
long long __usercall __spoils<eax,esi>
square@<ebx:ecx>(long long num@<eax:edx>) {
    return num * num;
}

bool __usercall is_even@<al>(int num) {
    return num % 2 == 0;
}

void __userpurge is_odd(int num, bool &result@<eax>) {
    result = num % 2 == 1;
}

auto is_odd_also = is_odd;

int *__usercall call_fn_ptr@<ebx>(int *(__usercall *x)@<eax>(long @<ecx>)@<edx>) {
    return x(1337);
}

// __shifted
// https://hex-rays.com/products/ida/support/idadoc/1695.shtml
typedef struct vec3f {
    float x;
    float y;
    float z;
} vec3f_t;

typedef struct player {
    char name[16];
    int health;
    int armor;
    int ammo;
    vec3f_t pos;
} player_t;

const char *get_player_name_from_shifted_pos_pointer(const vec3f_t *__shifted(player_t, 0x1C) pos) {
    return ADJ(pos)->name;
}
```
The first thing most people coming from MSVC say to me when I tell them
about this project is, "I won't have to do the `__fastcall`/`__thiscall` trick
anymore." What they don't know is that Clang already allows `__thiscall` on
non-member functions without this fork. For example, the following is
accepted by mainline Clang as well as this fork and produces the correct
output (`_this` in `ecx`, other arguments on the stack):
```cpp
int __thiscall square(void *_this, int num) {
    return num * num;
}
```

## Compiler Explorer

The compiler is available on the [Compiler Explorer website](https://godbolt.org/z/j4dPsE8rq).

## Motivation

The goal of the project is not to recompile decompiled code in all cases, but rather to provide a familiar syntax for common patterns and reduce the amount of inline assembly and fiddling required when writing high-level patch code. However, by providing this syntax, it is possible to recompile decompiled code in some cases. With the addition of [defs.h](#defsh), most individual functions can be recompiled with little to no modification. Recompiling an entire binary will still require great effort, but is made easier by this project.

## Status

The project is semi-functional but lacks polish. Correct syntax will be accepted
and generate correct code in most cases; however, incorrect syntax is handled
largely by asserts and internal compiler errors, especially in X86_64. More work
needs to be done to take advantage of Clang's diagnostics infrastructure and
produce pretty errors rather than compiler stack traces. Additionally, some
incorrect syntax is accepted and ignored rather than reported. Currently, only
the X86_32 and X86_64 backends are supported.

Next steps are to improve the diagnostics reporting as described above and fix
the bugs. Pull requests and issues are encouraged, especially pull requests
adding tests.

## Enable and Disable the Extensions

By default, the extensions are enabled. They can be disabled using the
Clang option `-fno-widberg-extensions`.

## Verify Widberg Extensions Are Present

The following construct can be used in source files to verify that the
widberg extensions are present:

```cpp
#ifndef __has_feature
# define __has_feature(x) 0 // Compatibility with non-clang compilers.
#endif
#ifndef __has_extension
# define __has_extension __has_feature // Compatibility with pre-3.0 compilers.
#endif

#if !__has_extension(widberg)
# error "This file requires a compiler that implements the widberg extensions."
#endif
```

Also, the preprocessor macro `__widberg__` is predefined if the extensions are present.

## defs.h

An alternative implementation of defs.h from the Hex-Rays decompiler SDK, intended to be used with this project, can be found at https://github.com/widberg/widberg-defs. A lot of the stuff in there is overkill for writing patch code, but it is useful for recompiling decompiled code.

## Build

Run the following from the `x64 Native Tools Command Prompt`:

```sh
cmake -S llvm -B build -G Ninja -DCMAKE_BUILD_TYPE=RelWithDebInfo -DLLVM_ENABLE_PROJECTS="clang" -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_TARGETS_TO_BUILD="X86"
cmake --build build --config RelWithDebInfo --target clang
```

# Affiliation with LLVM (Or Lack Thereof)

This project is not affiliated with the LLVM project in any way.
This project, like the LLVM project, is under the Apache License
v2.0 with LLVM Exceptions. I have no intention of upstreaming any
of the changes made in this repository as I believe they are not
useful to most people. The original LLVM project README.md begins
below.

# The LLVM Compiler Infrastructure

Welcome to the LLVM project!
5 changes: 4 additions & 1 deletion clang/include/clang-c/Index.h
@@ -2948,7 +2948,8 @@ enum CXTypeKind {

CXType_ExtVector = 176,
CXType_Atomic = 177,
CXType_BTFTagAttributed = 178
CXType_BTFTagAttributed = 178,
CXType_Shifted = 179,
};

/**
@@ -2976,6 +2977,8 @@ enum CXCallingConv {
CXCallingConv_AArch64VectorCall = 16,
CXCallingConv_SwiftAsync = 17,
CXCallingConv_AArch64SVEPCS = 18,
CXCallingConv_UserCall = 19,
CXCallingConv_UserPurge = 20,

CXCallingConv_Invalid = 100,
CXCallingConv_Unexposed = 200
4 changes: 4 additions & 0 deletions clang/include/clang/AST/ASTContext.h
@@ -240,6 +240,7 @@ class ASTContext : public RefCountedBase<ASTContext> {
mutable llvm::FoldingSet<BitIntType> BitIntTypes;
mutable llvm::FoldingSet<DependentBitIntType> DependentBitIntTypes;
llvm::FoldingSet<BTFTagAttributedType> BTFTagAttributedTypes;
llvm::FoldingSet<ShiftedType> ShiftedTypes;

mutable llvm::FoldingSet<QualifiedTemplateName> QualifiedTemplateNames;
mutable llvm::FoldingSet<DependentTemplateName> DependentTemplateNames;
@@ -1595,6 +1596,9 @@ class ASTContext : public RefCountedBase<ASTContext> {
QualType getBTFTagAttributedType(const BTFTypeTagAttr *BTFAttr,
QualType Wrapped);

QualType getShiftedType(const ShiftedAttr *SAttr,
QualType Wrapped);

QualType
getSubstTemplateTypeParmType(QualType Replacement, Decl *AssociatedDecl,
unsigned Index,
3 changes: 3 additions & 0 deletions clang/include/clang/AST/ASTNodeTraverser.h
@@ -391,6 +391,9 @@ class ASTNodeTraverser
void VisitBTFTagAttributedType(const BTFTagAttributedType *T) {
Visit(T->getWrappedType());
}
void VisitShiftedType(const ShiftedType *T) {
Visit(T->getWrappedType());
}
void VisitSubstTemplateTypeParmType(const SubstTemplateTypeParmType *) {}
void
VisitSubstTemplateTypeParmPackType(const SubstTemplateTypeParmPackType *T) {
14 changes: 12 additions & 2 deletions clang/include/clang/AST/DeclBase.h
@@ -61,6 +61,7 @@ class Stmt;
class StoredDeclsMap;
class TemplateDecl;
class TemplateParameterList;
class WidbergLocation;
class TranslationUnitDecl;
class UsingDirectiveDecl;

@@ -282,6 +283,9 @@ class alignas(8) Decl {
/// Loc - The location of this decl.
SourceLocation Loc;

WidbergLocation *WidLoc = nullptr;
WidbergLocation *WidRetLoc = nullptr;

/// DeclKind - This indicates which class this is.
unsigned DeclKind : 7;

@@ -381,7 +385,7 @@ class alignas(8) Decl {
protected:
Decl(Kind DK, DeclContext *DC, SourceLocation L)
: NextInContextAndBits(nullptr, getModuleOwnershipKindForChildOf(DC)),
DeclCtx(DC), Loc(L), DeclKind(DK), InvalidDecl(false), HasAttrs(false),
DeclCtx(DC), Loc(L), WidLoc(nullptr), WidRetLoc(nullptr), DeclKind(DK), InvalidDecl(false), HasAttrs(false),
Implicit(false), Used(false), Referenced(false),
TopLevelDeclInObjCContainer(false), Access(AS_none), FromASTFile(0),
IdentifierNamespace(getIdentifierNamespaceForKind(DK)),
@@ -390,7 +394,7 @@ class alignas(8) Decl {
}

Decl(Kind DK, EmptyShell Empty)
: DeclKind(DK), InvalidDecl(false), HasAttrs(false), Implicit(false),
: WidLoc(nullptr), WidRetLoc(nullptr), DeclKind(DK), InvalidDecl(false), HasAttrs(false), Implicit(false),
Used(false), Referenced(false), TopLevelDeclInObjCContainer(false),
Access(AS_none), FromASTFile(0),
IdentifierNamespace(getIdentifierNamespaceForKind(DK)),
@@ -432,6 +436,12 @@ class alignas(8) Decl {
SourceLocation getLocation() const { return Loc; }
void setLocation(SourceLocation L) { Loc = L; }

WidbergLocation *getWidbergLocation() const { return WidLoc; };
void setWidbergLocation(WidbergLocation *WL) { WidLoc = WL; };

WidbergLocation *getWidbergReturnLocation() const { return WidRetLoc; };
void setWidbergReturnLocation(WidbergLocation *WL) { WidRetLoc = WL; };

Kind getKind() const { return static_cast<Kind>(DeclKind); }
const char *getDeclKindName() const;
