DOTNEXT Moscow 2015

Generics under the hood

and a JITter bug for dessert

by Alexandr Nikitin

About me:

@nikitin_a_a

work @Adform

https://alexandrnikitin.github.io/blog

I 💘 OSS: NSubstitute, ☑ CoreFX, ☑ CoreCLR

Technical stuff:

1M HTTP RPS < 5ms

2 billion events per day

8 billion user profiles

Overview

  • Compare with Java and C++
  • Find out .NET is awesome
  • Object memory layout
  • Generics under the hood
  • The dessert - a bug in the JITter
  • Discuss some interesting things???

Java and swear words

“Even a dog gets used to being hanged.”

Lithuanian proverb

Generics in .NET

awesom-o

Well known benefits

  • Reduce code duplication
  • Constraints: class, struct, new(), base class, interfaces
  • Variance: covariance and contravariance for interfaces and delegates
  • Performance improvements (no boxing, no casting)
  • Compile-time checks
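
The "no boxing, no casting" and "compile-time checks" points can be illustrated with a minimal sketch. It assumes nothing beyond the BCL; the class and method names are made up for this example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BenefitsDemo
{
    // Non-generic collection: every int is boxed on Add and must be cast back.
    public static int SumNonGeneric()
    {
        var list = new ArrayList { 1, 2, 3 };
        var sum = 0;
        foreach (object item in list)
        {
            sum += (int)item; // unboxing cast, verified only at run time
        }
        return sum;
    }

    // Generic collection: ints are stored inline, no boxing and no casts.
    public static int SumGeneric()
    {
        var list = new List<int> { 1, 2, 3 };
        var sum = 0;
        foreach (int item in list)
        {
            sum += item;
        }
        // list.Add("four"); // does not compile: the check happens at compile time
        return sum;
    }
}
```

Both methods return the same sum; the difference is in allocations and in when a type mistake is caught.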

.NET memory layout


var o = new object();
                            

Instance on heap:

Sync block address
Method Table address
Field1
FieldN

Method Table

  • The central data structure of the runtime
  • "Hot" data

EEClass address
Interface Map Table address
Inherited Virtual Method addresses
Introduced Virtual Method addresses
Instance Method addresses
Static Method addresses
Static Fields values
InterfaceN method addresses
Other

EEClass

  • Everything about the type
  • Data needed by type loading, JIT compilation or reflection
  • "Cold" data
  • Complex: also contains EEClassOptionalFields and EEClassPackedFields

WinDbg the Great and Powerful

SOS (Son of Strike)

SOSEX (SOS extensions)

An example class:


public class MyClass
{
    private int _myField;

    public int MyMethod()
    {
        return _myField;
    }
}
                            

An instance:


var myClass = new MyClass();
                                    


0:003> !DumpHeap -type GenericsUnderTheHood.MyClass
Address          MT                     Size
0000004a2d912de8 00007fff8e7540d8       24

Statistics:
MT                      Count       TotalSize   Class Name
00007fff8e7540d8        1           24          GenericsUnderTheHood.MyClass
Total 1 objects
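
The Size column above can be reproduced by hand. This is a rough sketch that assumes a 64-bit process, where the object header and the Method Table address take 8 bytes each and objects are aligned to 8 bytes. The helper name is hypothetical and the rules are simplified.

```csharp
public static class ObjectSizeSketch
{
    private const int PointerSize = 8; // x64 assumption

    // Estimates the heap size of an instance from the total size of its fields.
    public static int EstimateSize(int fieldBytes)
    {
        int raw = PointerSize   // object header (sync block)
                + PointerSize   // Method Table address
                + fieldBytes;   // instance fields

        // Round up to the next multiple of the pointer size.
        int aligned = (raw + PointerSize - 1) / PointerSize * PointerSize;

        // Assumed minimum: header + Method Table address + one pointer-sized slot.
        int minimum = 3 * PointerSize;
        return aligned < minimum ? minimum : aligned;
    }
}
```

For MyClass with its single int field, `ObjectSizeSketch.EstimateSize(4)` gives 24, which matches the dump.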
                                    

Method Table looks like:

method-table

EEClass looks like:

eeclass

Links:

WinDbg links

SOS & SOSEX

HOWTO: Debugging .NET with WinDbg

Article: .NET internals

Book: "Pro .NET Performance" by Sasha Goldshtein, Dima Zurbalev, Ido Flatow

Generics memory layout

An example class:


public class MyGenericClass<T>
{
    private T _myField;

    public T MyMethod()
    {
        return _myField;
    }
}
                            

Compiles to:

il-template

Miracle! The CLR knows about generics

An instance:


var myObject = new MyGenericClass<object>();
                        

Method Table:

method-table-generic-object

EEClass:

eeclass-generic-object

An instance:


var myString = new MyGenericClass<string>();
                        

Method Table:

method-table-generic-string

An instance:


var myInt = new MyGenericClass<int>();
                        

Method Table:

method-table-generic-int

EEClass:

eeclass-generic-int

Under the hood

awesom-o
  • Value types don't share anything
  • Code bloat ⚠
  • Reference types share code and EEClass
  • Method Table ≡ Type in Runtime
  • System.__Canon - an internal type

Crucial part:

Unknown type -> Runtime handle lookup


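public class MyGenericClass<T> where T : new() // new T() requires the new() constraint
{
    public void MyMethod()
    {
        // In code shared between reference types the exact T is unknown,
        // so the runtime has to look up its handle here.
        new T();
    }
}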
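
A runnable sketch of the same situation: the method body is written once, yet `typeof(T)` and `new T()` still resolve to the exact type argument at run time. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

public class HandleLookupDemo<T> where T : new()
{
    // For reference-type arguments this body is expected to be compiled once
    // and shared, so the exact T must be resolved through a runtime lookup.
    public string CreateAndDescribe()
    {
        T instance = new T();
        return typeof(T).Name + "/" + instance.GetType().Name;
    }
}

public static class HandleLookupUsage
{
    public static string ForObject()
    {
        return new HandleLookupDemo<object>().CreateAndDescribe();
    }

    public static string ForStringBuilder()
    {
        return new HandleLookupDemo<StringBuilder>().CreateAndDescribe();
    }
}
```

`ForObject()` returns "Object/Object" and `ForStringBuilder()` returns "StringBuilder/StringBuilder", although both calls go through the same source method.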
                            

Optimizations by CLR

  1. Class loader (30x)
  2. Global cache lookup (6x)
  3. Global cache lookup (no type hierarchy walk) (3x)
  4. Method Table slots (1x)

Links:

Design and Implementation of Generics for the .NET Common Language Runtime

About generics in CoreCLR documentation

Optimizations explained by a core developer

Dessert

Simplified version:


public class BaseClass<T>
{
    private List<T> list = new List<T>();

    public BaseClass()
    {
        foreach (var _ in list) { }
    }

    public void Run()
    {
        for (var i = 0; i < 11; i++)
            if (list.Any(_ => true))
                return;
    }
}

public class DerivedClass : BaseClass<object> { }
                            

Benchmark? BenchmarkDotNet! The source


[BenchmarkTask(platform: BenchmarkPlatform.X86, jitVersion: BenchmarkJitVersion.LegacyJit)]
[BenchmarkTask(platform: BenchmarkPlatform.X64, jitVersion: BenchmarkJitVersion.LegacyJit)]
[BenchmarkTask(platform: BenchmarkPlatform.X64, jitVersion: BenchmarkJitVersion.RyuJit)]
public class Jit_GenericsMethod
{
    // ... BaseClass and DerivedClass ...

    private BaseClass<object> baseClass = new BaseClass<object>();
    private BaseClass<object> derivedClass = new DerivedClass();

    [Benchmark]
    public void Base()
    {
        baseClass.Run();
    }

    [Benchmark]
    public void Derived()
    {
        derivedClass.Run(); // 3-5 times slower
    }
}
                            

Workaround:

"Just add two methods"


public class BaseClass<T>
{
    //...

    public void Method1()
    {
    }

    public void Method2()
    {
    }
}
                            

The fix

The pull request on GitHub

Moral

Links:

Stack Overflow question

An issue on GitHub

A great explanation by a CLR core developer

The pull request with the fix

Interesting things

Heuristic algorithm


DWORD numMethodsAdjusted =
    (bmtMethod->dwNumDeclaredNonAbstractMethods == 0)
    ? 0
    : (bmtMethod->dwNumDeclaredNonAbstractMethods < 3)
    ? 3
    : bmtMethod->dwNumDeclaredNonAbstractMethods;

DWORD nTypeFactorBy2 = (bmtGenerics->GetNumGenericArgs() == 1)
                       ? 2
                       : 3;

DWORD estNumTypeSlots = (numMethodsAdjusted * nTypeFactorBy2 + 2) / 3;
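
To get a feel for the numbers, the heuristic above can be ported to C# and evaluated for a few inputs. This is a direct transcription of the arithmetic only; the class, method and parameter names are hypothetical, and how the CLR uses the result is not modelled.

```csharp
public static class DictionarySlotHeuristic
{
    // Mirrors the estNumTypeSlots computation from the snippet above.
    public static uint EstimateTypeSlots(uint declaredNonAbstractMethods, uint genericArgs)
    {
        // Types with 1 or 2 methods are treated as if they had 3.
        uint numMethodsAdjusted =
            declaredNonAbstractMethods == 0u ? 0u
            : declaredNonAbstractMethods < 3u ? 3u
            : declaredNonAbstractMethods;

        // Factor 2 for a single generic argument, 3 otherwise.
        uint nTypeFactorBy2 = genericArgs == 1u ? 2u : 3u;

        // Integer division, exactly as in the C++ original.
        return (numMethodsAdjusted * nTypeFactorBy2 + 2u) / 3u;
    }
}
```

With one generic argument, 1, 2 or 3 declared methods all give an estimate of 2, while 4 methods give 3. With two generic arguments, 4 methods give 4.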
                            

Sources

Dictionary lookup


CORINFO_GENERIC_HANDLE
JIT_GenericHandleWorker(
    MethodDesc *  pMD,
    MethodTable * pMT,
    LPVOID        signature)
{
    CONTRACTL {
        THROWS;
        GC_TRIGGERS;
    } CONTRACTL_END;

    MethodTable * pDeclaringMT = NULL;

    if (pMT != NULL)
    {
        SigPointer ptr((PCCOR_SIGNATURE)signature);

        ULONG kind; // DictionaryEntryKind
        IfFailThrow(ptr.GetData(&kind));

        // We need to normalize the class passed in (if any) for reliability purposes. That's because preparation of a code region that
        // contains these handle lookups depends on being able to predict exactly which lookups are required (so we can pre-cache the
        // answers and remove any possibility of failure at runtime). This is hard to do if the lookup (in this case the lookup of the
        // dictionary overflow cache) is keyed off the somewhat arbitrary type of the instance on which the call is made (we'd need to
        // prepare for every possible derived type of the type containing the method). So instead we have to locate the exactly
        // instantiated (non-shared) super-type of the class passed in.

        ULONG dictionaryIndex = 0;
        IfFailThrow(ptr.GetData(&dictionaryIndex));

        pDeclaringMT = pMT;
        for (;;)
        {
            MethodTable * pParentMT = pDeclaringMT->GetParentMethodTable();
            if (pParentMT->GetNumDicts() <= dictionaryIndex)
                break;
            pDeclaringMT = pParentMT;
        }

        if (pDeclaringMT != pMT)
        {
            JitGenericHandleCacheKey key((CORINFO_CLASS_HANDLE)pDeclaringMT, NULL, signature);
            HashDatum res;
            if (g_pJitGenericHandleCache->GetValue(&key,&res))
            {
                // Add the denormalized key for faster lookup next time. This is not a critical entry - no need
                // to specify appdomain affinity.
                JitGenericHandleCacheKey denormKey((CORINFO_CLASS_HANDLE)pMT, NULL, signature);
                AddToGenericHandleCache(&denormKey, res);
                return (CORINFO_GENERIC_HANDLE) (DictionaryEntry) res;
            }
        }
    }

    DictionaryEntry * pSlot;
    CORINFO_GENERIC_HANDLE result = (CORINFO_GENERIC_HANDLE)Dictionary::PopulateEntry(pMD, pDeclaringMT, signature, FALSE, &pSlot);

    if (pSlot == NULL)
    {
        // If we've overflowed the dictionary write the result to the cache.
        BaseDomain *pDictDomain = NULL;

        if (pMT != NULL)
        {
            pDictDomain = pDeclaringMT->GetDomain();
        }
        else
        {
            pDictDomain = pMD->GetDomain();
        }

        // Add the normalized key (pDeclaringMT) here so that future lookups of any
        // inherited types are faster next time rather than just for this specific pMT.
        JitGenericHandleCacheKey key((CORINFO_CLASS_HANDLE)pDeclaringMT, (CORINFO_METHOD_HANDLE)pMD, signature, pDictDomain);
        AddToGenericHandleCache(&key, (HashDatum)result);
    }

    return result;
}
                                

Sources

TODO

Generic method:


public class MyClassWithGenericMethod
{
    public T MyGenericMethod<T>(T arg)
    {
        return arg;
    }
}
                        

Generic struct:


public struct MyGenericStruct<T>
{
    private T _myField;

    public T MyMethod()
    {
        return _myField;
    }
}
                        

THE END