Remove Debugging.AssertsEnabled #346
Comments
Changed here: be2e7bd |
Changing to a Lazy variable improves it but still quite heavy for that - would be better to fully remove from there, any thoughts @NightOwl888 ? |
This seems expensive, @NightOwl888. Any reason it's not behind a compilation flag? |
The only place I saw this being set so far is under Lucene.Net.Index.CheckIndex, so I added a conditional compilation there too: 5a14eaa#diff-3f750f45c88050fc4fbd6a09ee4c638cR54 But I have unloaded some projects (VS was too slow with everything loaded), so might be missing something... |
See #326 for an explanation of this. We need this feature for the tests to work right in Release mode and for other features, such as CheckIndex to work properly. The designers of Lucene decided to make the asserts into test conditions, so without them we are literally skipping tests. |
I am open to discussing alternatives to
If there are any alternatives, we should also consider how much work they are to implement and how difficult they would be to keep in sync with Lucene's asserts. I don't particularly care for Lucene's test design being partially implemented inside the released code and realize it does come at a performance cost in production, but I didn't see any viable alternatives where we (and end users) could turn on asserts in Release mode. The impact was minimized as much as possible by using a field instead of a property and by using

    private DocumentsWriterPerThread InternalTryCheckOutForFlush(ThreadState perThread)
    {
        if (Debugging.AssertsEnabled)
        {
            // LUCENENET specific - Since we need to mimic the unfair behavior of ReentrantLock, we need to ensure that all threads that enter here hold the lock.
            Debugging.Assert(perThread.IsHeldByCurrentThread);
            Debugging.Assert(Monitor.IsEntered(this));
            Debugging.Assert(perThread.flushPending);
        }
        try
        {
            // LUCENENET specific - We removed the call to perThread.TryLock() and the try-finally below as they are no longer needed.
            // We are pending so all memory is already moved to flushBytes
            if (perThread.IsInitialized)
            {
                if (Debugging.AssertsEnabled) Debugging.Assert(perThread.IsHeldByCurrentThread);
                DocumentsWriterPerThread dwpt;
                long bytes = perThread.bytesUsed; // do that before replace!
                dwpt = perThreadPool.Reset(perThread, closed);
                if (Debugging.AssertsEnabled) Debugging.Assert(!flushingWriters.ContainsKey(dwpt), "DWPT is already flushing");
                // Record the flushing DWPT to reduce flushBytes in doAfterFlush
                flushingWriters[dwpt] = bytes;
                numPending--; // write access synced
                return dwpt;
            }
            return null;
        }
        finally
        {
            UpdateStallState();
        }
    }

This design choice was made primarily because it keeps close parity with Lucene's source code and design choices. When it comes down to the choice between better performance and ability to detect bugs, we went with ability to detect bugs since we cannot have both without a major redesign. |
Couldn't we just add a conditional compilation flag for when compiling the tests projects? Instead of a |
End users who need to test their extensions in the compiled release will need to enable the asserts. The test framework requires asserts to be enabled in order to run all of the conditions and throw all of the expected exceptions. In addition, the The main issue is the Lucene designers made no distinction between the Not to mention, what are we actually testing if we rebuild after we test, and how do we know the final build does what is expected? Your results might actually be skewed because you are seeing the initial impact of loading up the |
Think I found a better solution for this case that keeps the current assert logic, but removes it from tight loops: #347 I've wrapped Debugging.AssertsEnabled around a lazy - but I'm not sure if there is any other way of setting it after lucene is loaded. I could only find this one in CheckIndex: // LUCENENET specific - rather than having the user specify whether to enable asserts, we always run with them enabled.
Debugging.AssertsEnabled = true; I've the impression that only EnvironmentVariablesConfigurationProvider implements IConfigurationProvider, and thus it would only be read once from the current environment variables. |
CheckIndex is wrapped into a dotnet tool,
That is the default setting when running This differs a bit from the state of things in .NET - in Java, the property values can actually be read inside of the application. We did the next best thing, which was to use .NET configuration providers to read the settings from outside of the application. There aren't a lot of settings that are actually meaningful in Lucene.NET, but enabling asserts is one of the biggest ones. Of course, by default the setting is missing in the provider and after it loads and realizes it is missing, it uses the second parameter as the default setting. |
I understand it's complicated, truly I do. But why should we stick to the bad Java pattern? We all want a highly performant search library, and for production use I really don't want or need the overhead of debug assertions. A 10% performance penalty isn't worth Java compliance in this case. I strongly feel that moving this to #IF TESTING is the better option: for production use, we'll compile without it and get the best of both worlds. |
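To make the compile-time option above concrete, here is a minimal sketch of what it could look like using `System.Diagnostics.ConditionalAttribute`. The `TESTING` symbol and every type and method name here are assumptions for illustration, not Lucene.NET code. When the symbol is not defined at the call site, the compiler omits the call together with the evaluation of its arguments.

```csharp
using System;
using System.Diagnostics;

public static class CompileTimeAsserts
{
    // Counts how often the (hypothetical) assert condition is evaluated.
    public static int Evaluations;

    // Calls to this method are dropped by the compiler, arguments included,
    // unless the hypothetical TESTING symbol is defined where the call is compiled.
    [Conditional("TESTING")]
    public static void Assert(bool condition, string message)
    {
        if (!condition) throw new InvalidOperationException(message);
    }

    private static bool InRange(int index, int length)
    {
        Evaluations++;
        return index >= 0 && index < length;
    }

    // Returns the number of condition evaluations performed so far.
    public static int Touch(int index, int length)
    {
        Assert(InRange(index, length), "index out of range");
        return Evaluations;
    }
}
```

The trade-off raised elsewhere in this thread still applies: a binary compiled without the symbol has no checks left to switch on, so end users of the released package could not enable them at run time.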
@eladmarg I think in this specific case we could attack the few places that were calling this in a loop and just get rid of the overhead. @NightOwl888 from what I understand, in any case using IConfigurationFactory with custom providers, that would be set once at the start of the program and not changed afterwards, right? From a quick glance, it doesn't seem that Properties support live-reload of configuration in any case. So whether it's cached in the Lazy or within the more complex System.Properties code, it is cached, which should make this PR a non-breaking change... |
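The caching behaviour described above can be sketched with `Lazy<bool>`. This is only an illustration under stated assumptions: the environment variable name `EXAMPLE_ASSERTS_ENABLED` and the class name are made up, and the real code reads its setting through Lucene.NET's configuration providers rather than directly from the environment.

```csharp
using System;

public static class CachedAssertFlag
{
    // Hypothetical variable name, used only for this sketch.
    private const string VariableName = "EXAMPLE_ASSERTS_ENABLED";

    // Counts how many times the underlying setting is actually read.
    public static int ReadCount;

    // The factory runs once, on the first access to Value; later reads reuse the result.
    private static readonly Lazy<bool> enabled = new Lazy<bool>(() =>
    {
        ReadCount++;
        string raw = Environment.GetEnvironmentVariable(VariableName);
        return bool.TryParse(raw, out bool value) && value;
    });

    public static bool AssertsEnabled => enabled.Value;
}
```

Because the value is fixed after the first read, changing the environment variable later has no effect on `AssertsEnabled`. Note that a `Lazy<T>.Value` read is still a little more work than reading a plain static field.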
Where do you see a 10% performance penalty? Running a basic search using the SearchFiles demo project as a template (see #310) shows virtually no impact in search speed and about a 1.5% increase in RAM usage. On the other hand, the IndexFiles benchmark has about a 4% decrease in performance and a 23% increase in RAM consumption. Run your own benchmarks on 4.8.0-beta00012 vs 4.8.0-beta00011, I am interested to see if you are getting different results. The problem here is that there seems to be a lot of focus on the initial request that loads the
We just migrated from what is effectively that approach (using
I will tell you exactly why - unless we run all of the test conditions (and make no mistake about it, the asserts are intended to be test conditions), we have no way of realistically determining if the application is functioning correctly as it was designed. |
Tried on a very simple benchmark - running on a laptop so don't expect perfect results... In any case I think the gain is obvious :)

    using System;
    using System.Collections.Generic;
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Jobs;

    [SimpleJob(RuntimeMoniker.NetCoreApp31)]
    public class DebuggingFlags
    {
        private static Dictionary<string, string> _flags = new Dictionary<string, string>() { ["assert"] = "false" };
        private HashSet<int> _set = new HashSet<int>();

        [Params(1000, 10000, 100_000)]
        public int N;

        // Looks the flag up on every iteration.
        [Benchmark]
        public void Loop()
        {
            for (int i = 0; i < N; i++)
            {
                Set(i);
            }
        }

        // Reads the flag once, outside the loop.
        [Benchmark]
        public void CachedLoop()
        {
            bool flag = AssertsEnabled;
            for (int i = 0; i < N; i++)
            {
                SetPassingFlag(i, flag);
            }
        }

        // A property: the dictionary lookup and bool.Parse run on every read.
        internal static bool AssertsEnabled
        {
            get
            {
                return _flags.TryGetValue("assert", out var flag) ? bool.Parse(flag) : false;
            }
        }

        private void Set(int index)
        {
            SetPassingFlag(index, AssertsEnabled);
        }

        private void SetPassingFlag(int index, bool flag)
        {
            // Index 0 is valid, so the lower bound is inclusive.
            if (flag) Assert(index >= 0 && index < N);
            _set.Add(index);
        }

        private static void Assert(bool condition)
        {
            // Throw when the condition does NOT hold.
            if (!condition) throw new Exception("Failed assertion");
        }
    }
|
I have spotted and corrected an error in the benchmark. It looks like the following line:

    public static bool AssertsEnabled = SystemProperties.GetPropertyAsBoolean("assert", false);

was interpreted as...

    public static bool AssertsEnabled => SystemProperties.GetPropertyAsBoolean("assert", false);

Subtle, but critical. With that change, the difference is far less than the margin of error.

    using System;
    using System.Collections.Generic;
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Jobs;

    [SimpleJob(RuntimeMoniker.NetCoreApp31)]
    [MemoryDiagnoser]
    public class DebuggingFlags
    {
        private static Dictionary<string, string> _flags = new Dictionary<string, string>() { ["assert"] = "false" };
        private HashSet<int> _set = new HashSet<int>();

        [Params(1000, 10000, 100_000)]
        public int N;

        [Benchmark]
        public void Loop()
        {
            for (int i = 0; i < N; i++)
            {
                Set(i);
            }
        }

        [Benchmark]
        public void CachedLoop()
        {
            bool flag = AssertsEnabled;
            for (int i = 0; i < N; i++)
            {
                SetPassingFlag(i, flag);
            }
        }

        // A field: the initializer runs once, when the class is first used.
        internal static bool AssertsEnabled = _flags.TryGetValue("assert", out var flag) ? bool.Parse(flag) : false;

        private void Set(int index)
        {
            SetPassingFlag(index, AssertsEnabled);
        }

        private void SetPassingFlag(int index, bool flag)
        {
            // Index 0 is valid, so the lower bound is inclusive.
            if (flag) Assert(index >= 0 && index < N);
            _set.Add(index);
        }

        private static void Assert(bool condition)
        {
            // Throw when the condition does NOT hold.
            if (!condition) throw new Exception("Failed assertion");
        }
    }

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.1016 (1909/November2018Update/19H2)
Intel Core i7-8850H CPU 2.60GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=3.1.301
[Host] : .NET Core 3.1.5 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.27001), X64 RyuJIT
.NET Core 3.1 : .NET Core 3.1.5 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.27001), X64 RyuJIT
Job=.NET Core 3.1 Runtime=.NET Core 3.1
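I guess another issue is that the dictionary doesn't contain an "assert" element by default, so the bool.Parse() should also be taken out of the equation. But since that all happens the first time the class is accessed by anything, it really makes no difference to the benchmark. The only difference is that in Lucene.NET, the first call to load the dictionary based on environment variables is a bit more expensive than just having a static dictionary. I will try running benchmarks on #347 tomorrow and report the results. |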
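The difference between `=` and `=>` that tripped up the benchmark can be shown with a small sketch. All names here are hypothetical, and `ReadSetting` merely stands in for a call such as `SystemProperties.GetPropertyAsBoolean("assert", false)`.

```csharp
using System;

public static class FieldVersusProperty
{
    // Counts how many times the stand-in settings lookup runs.
    public static int LookupCount;

    // Stand-in for an expensive configuration lookup (an assumption for this sketch).
    private static bool ReadSetting()
    {
        LookupCount++;
        return false;
    }

    // '=' declares a static field: the initializer runs once, during type initialization.
    public static bool AssertsEnabledField = ReadSetting();

    // '=>' declares an expression-bodied property: the body runs on every read.
    public static bool AssertsEnabledProperty => ReadSetting();

    // Returns how many lookups were triggered by the given number of reads.
    public static int CountLookups(int reads, bool useProperty)
    {
        int before = LookupCount;
        for (int i = 0; i < reads; i++)
        {
            _ = useProperty ? AssertsEnabledProperty : AssertsEnabledField;
        }
        return LookupCount - before;
    }
}
```

Reading the field in a loop triggers no further lookups, while reading the property triggers one per iteration. That is why a benchmark that models the flag as a property overstates its cost.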
True, I must have totally misread that line. I had the impression it was
always calling the System Properties method. Then the cost that was showing
in the profiler must have been only due to the field access outside of the
inner method, not from the actual getter, so we can remove the change to
the Debugging file
…On Wed, Sep 23, 2020, 7:10 PM Shad Storhaug ***@***.***> wrote:
Method      N       Mean         Error      StdDev     Gen 0  Gen 1  Gen 2  Allocated
Loop        1000    13.55 μs     0.152 μs   0.127 μs   -      -      -      -
CachedLoop  1000    13.55 μs     0.217 μs   0.203 μs   -      -      -      -
Loop        10000   137.09 μs    2.668 μs   3.911 μs   -      -      -      -
CachedLoop  10000   136.20 μs    2.403 μs   2.248 μs   -      -      -      -
Loop        100000  1,367.65 μs  27.162 μs  40.655 μs  -      -      -      -
CachedLoop  100000  1,365.65 μs  22.746 μs  27.934 μs  -      -      -      -
I guess another issue is that the dictionary doesn't contain an "assert"
element by default, so the bool.Parse() should also be taken out of the
equation. But since that all happens the first time the class is accessed
by anything, it really makes no difference to the benchmark. The only
difference is that in Lucene.NET, the first call to load the dictionary
based on environment variables is a bit more expensive than just having a
static dictionary.
I will try running benchmarks on #347 tomorrow and report the results.
|
While it might not be possible to determine which asserts were specifically meant for the features we need to enable for end users to use all of Lucene.NET's features, there are certain asserts that we can definitely rule out as having no benefit to end users. In those cases, we can reduce the impact of this feature by going back to compiling out of the release by using

    private DocumentsWriterPerThread InternalTryCheckOutForFlush(ThreadState perThread)
    {
        if (Debugging.AssertsEnabled)
        {
            // LUCENENET specific - Since we need to mimic the unfair behavior of ReentrantLock, we need to ensure that all threads that enter here hold the lock.
            Debugging.Assert(perThread.IsHeldByCurrentThread);
            Debugging.Assert(Monitor.IsEntered(this));
            Debugging.Assert(perThread.flushPending);
        }
        try
        {
            // LUCENENET specific - We removed the call to perThread.TryLock() and the try-finally below as they are no longer needed.
            // We are pending so all memory is already moved to flushBytes
            if (perThread.IsInitialized)
            {
                if (Debugging.AssertsEnabled) Debugging.Assert(perThread.IsHeldByCurrentThread);
                DocumentsWriterPerThread dwpt;
                long bytes = perThread.bytesUsed; // do that before replace!
                dwpt = perThreadPool.Reset(perThread, closed);
                if (Debugging.AssertsEnabled) Debugging.Assert(!flushingWriters.ContainsKey(dwpt), "DWPT is already flushing");
                // Record the flushing DWPT to reduce flushBytes in doAfterFlush
                flushingWriters[dwpt] = bytes;
                numPending--; // write access synced
                return dwpt;
            }
            return null;
        }
        finally
        {
            UpdateStallState();
        }
    }

In these cases, the end user has no control over the conditions that are being checked - they are internal state of The same can probably be said about many more of the asserts. We should start with I am working on getting a nightly build set up so we can move the burden of testing edge cases and invariants such as these out of the normal workflow. While all of the features in both the test framework and the Azure Pipelines templates are already implemented for nightly builds, some of the tests were designed with longer runs than the 1 hour limit of Azure DevOps in mind, so adjustments to the nightly test limits need to be made to keep it from timing out. |
@NightOwl888 I'm reviewing my previous benchmarks on our usage of Lucene today, and I think my initial conclusion on Debugging.AssertsEnabled being the cause of the "slowness" of FixedBitSet.Set / Get is actually incorrect. Using dotMemory now to measure memory allocations, one can clearly see that there is a lambda capture being allocated on every call to the FixedBitSet.Set / Get methods. This can also be seen on SharpLab if we look at the decompiled C# code:

    public class FixedBitSet
    {
        [CompilerGenerated]
        private sealed class <>c__DisplayClass3_0
        {
            public int index;
            public FixedBitSet <>4__this;
            internal string <Set>b__0()
            {
                return "index=" + index + ", numBits=" + <>4__this.numBits;
            }
        }
        internal readonly long[] bits;
        internal readonly int numBits;
        internal readonly int numWords;
        public void Set(int index)
        {
            // <-- unnecessary allocation, probably due to the capture of numBits
            <>c__DisplayClass3_0 <>c__DisplayClass3_ = new <>c__DisplayClass3_0();
            <>c__DisplayClass3_.index = index;
            <>c__DisplayClass3_.<>4__this = this;
            if (Debugging.AssertsEnabled)
            {
                Debugging.Assert(<>c__DisplayClass3_.index >= 0 && <>c__DisplayClass3_.index < numBits, new Func<string>(<>c__DisplayClass3_.<Set>b__0));
            }
            int num = <>c__DisplayClass3_.index >> 6;
            int num2 = <>c__DisplayClass3_.index & 0x3F;
            long num3 = 1L << num2;
            bits[num] |= num3;
        }
    }

@NightOwl888 I'll push a pull request with only the fix for this asap |
@NightOwl888 : here is the proposed change for this case: #372 Would it be fine to just remove this overload of Debugging.Assert completely? We could also change to use something similar to how ZLogger implemented the zero allocation methods, by just having a couple of different overloads of Debugging.Assert with generic parameter types, and a call to string.Format() |
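A rough sketch of the generic-overload idea mentioned above, next to the lambda-based call it would replace. Everything here is hypothetical (`AssertSketch`, `TinyBitSet`, the overload shape, and the use of `InvalidOperationException`); it is not the actual `Debugging.Assert` API. The point is that the format arguments are passed by value and `string.Format()` runs only when the condition fails, so no closure object is needed on the success path.

```csharp
using System;

public static class AssertSketch
{
    public static bool AssertsEnabled = true;

    // Lambda-based overload: a caller that captures locals allocates a closure on every call.
    public static void Assert(bool condition, Func<string> messageFactory)
    {
        if (AssertsEnabled && !condition) throw new InvalidOperationException(messageFactory());
    }

    // Hypothetical generic overload: the message is formatted only on failure.
    public static void Assert<T0, T1>(bool condition, string format, T0 arg0, T1 arg1)
    {
        if (AssertsEnabled && !condition) throw new InvalidOperationException(string.Format(format, arg0, arg1));
    }
}

public sealed class TinyBitSet
{
    private readonly long[] bits;
    private readonly int numBits;

    public TinyBitSet(int numBits)
    {
        this.numBits = numBits;
        bits = new long[(numBits + 63) >> 6];
    }

    // Captures 'index' and 'this', so the compiler generates a closure class.
    public void SetWithLambda(int index)
    {
        AssertSketch.Assert(index >= 0 && index < numBits, () => "index=" + index + ", numBits=" + numBits);
        bits[index >> 6] |= 1L << (index & 0x3F);
    }

    // Passes the values directly; nothing is captured.
    public void SetWithFormat(int index)
    {
        AssertSketch.Assert(index >= 0 && index < numBits, "index={0}, numBits={1}", index, numBits);
        bits[index >> 6] |= 1L << (index & 0x3F);
    }

    public bool Get(int index) => (bits[index >> 6] & (1L << (index & 0x3F))) != 0;
}
```

One caveat, in line with the concern about exceptions while building messages: the arguments are evaluated before the call, so this shape only suits cheap, side-effect-free values such as ints.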
Correct. But if the application is running correctly, building the string will never occur. The string should only be built when there is an error. Building the string on every call and then discarding it slows down the tests.
Well, no because there are 2-3 cases where exceptions will be thrown when trying to build the string in the I would prefer a better solution if there is one - in Java, the string is only built in the case there is a failure, and it would be best to duplicate that so we don't have to deal with these failure cases. However, in cases where performance is being significantly affected in production, we can call the overload without the lambda. Frankly, I think the most performant solution would be just to eliminate the call to

    // Before
    if (Debugging.AssertsEnabled) Debugging.Assert(outputLen > 0, () => "output contains empty string: " + scratchChars);
    // After
    if (Debugging.AssertsEnabled && !(outputLen > 0)) throw new AssertionException($"output contains empty string: {scratchChars}");

|
Think the only problem with that is that the JIT usually won't inline methods with a throw expression. |
But good point about exceptions when building the string. I'll try to only do the simple cases in a first pass (i.e. printing int values, etc) and leave the callback approach for the others for now |
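One common way to address the inlining concern above is a throw helper: the `throw` statement lives in a separate method that is never inlined, so the calling method stays small. This is a minimal sketch with hypothetical names, and it uses `InvalidOperationException` in place of Lucene.NET's `AssertionException` to stay self-contained.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ThrowHelperSketch
{
    public static bool AssertsEnabled = true;

    // Hot-path method: contains no throw statement, only a call on the failure path.
    public static int CheckedLength(int outputLen)
    {
        if (AssertsEnabled && !(outputLen > 0))
        {
            ThrowAssertion("output contains empty string", outputLen);
        }
        return outputLen;
    }

    // Cold path: the message is built and the exception is thrown here, out of line.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void ThrowAssertion(string message, int value)
    {
        throw new InvalidOperationException(message + ": " + value);
    }
}
```

Whether this actually changes the JIT's inlining decision for a given method would need to be measured; the sketch only shows the shape of the pattern.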
Just to throw a sort of Nuclear option into the mix (And perhaps for some, a bit of a scary option as well), It would to some degree be possible to utilize AOP to actually inject code directly into the existing code at runtime. It has a small performance penalty (But I have not tested it in comparison to the design there is already in place, only made a Clean vs. AOP'ed method test), but for at least some of the examples here it would not be noticeable. It is a bit of a challenge for methods where the use of "Debugging" does not happen right at the beginning or the end, something that certainly has multiple solutions, but finding one where we can safely say it's "not just there" for testing could be difficult in some cases. If we take a really simple example with harmony:
BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18362.1082 (1903/May2019Update/19H1)
Intel Core i9-7900X CPU 3.30GHz (Kaby Lake), 1 CPU, 20 logical and 10 physical cores
.NET Core SDK=5.0.100-rc.1.20452.10
[Host] : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
DefaultJob : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
|
…ters so the parameters are not resolved until a condition fails. There are still some calls that do light math and pick items from arrays, but this performance hit in the tests is something we can live with for better production performance. Closes apache#346, closes apache#373, closes apache#372.
Doing some benchmarking here, just saw this strange one on FixedBitSet.cs
@NightOwl888 do you mind if I replace this with a conditional compilation flag?