Feature/phrase search #6
Open
mobinbr wants to merge 10 commits into main from feature/phrase-search
10 commits
cd88452 init: start phase 5 (rajabi-m)
8b8a3b4 refactor: remove AddDocumentToKeyword from IInvertedIndex (rajabi-m)
74b4a28 test: add tests for complex parsing (rajabi-m)
758e1d0 feat: change InputParser to pass complex parsing tests (amiralirahimii)
b7d9442 feat: add AdvancedInvertedIndex.cs (amiralirahimii)
d56c880 feat: complete AdvancedInvertedIndex to pass the tests (rajabi-m)
9ec7dda feat: add FilesAdvancedInvertedIndexBuilder (rajabi-m)
52ef369 fix: fix AdvancedInvertedIndex equality (mobinbr)
5a95dea test: expand InvertedIndexSearcherTest (mobinbr)
eb01596 refactor: apply viewers comments. (rajabi-m)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
| ||
Microsoft Visual Studio Solution File, Format Version 12.00 | ||
# Visual Studio Version 17 | ||
VisualStudioVersion = 17.0.31903.59 | ||
MinimumVisualStudioVersion = 10.0.40219.1 | ||
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "DocumentManagement", "src\DocumentManagement\DocumentManagement.csproj", "{D471FD38-26BA-4DEF-96A2-982F235AEA01}" | ||
EndProject | ||
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "App", "src\App\App.csproj", "{0735FD70-626C-4D88-94E3-4B410C5A41E9}" | ||
EndProject | ||
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "test", "test\test.csproj", "{FD776EED-DBA5-4789-A1A7-FF97EF11FBD3}" | ||
EndProject | ||
Global | ||
GlobalSection(SolutionConfigurationPlatforms) = preSolution | ||
Debug|Any CPU = Debug|Any CPU | ||
Release|Any CPU = Release|Any CPU | ||
EndGlobalSection | ||
GlobalSection(ProjectConfigurationPlatforms) = postSolution | ||
{D471FD38-26BA-4DEF-96A2-982F235AEA01}.Debug|Any CPU.ActiveCfg = Debug|Any CPU | ||
{D471FD38-26BA-4DEF-96A2-982F235AEA01}.Debug|Any CPU.Build.0 = Debug|Any CPU | ||
{D471FD38-26BA-4DEF-96A2-982F235AEA01}.Release|Any CPU.ActiveCfg = Release|Any CPU | ||
{D471FD38-26BA-4DEF-96A2-982F235AEA01}.Release|Any CPU.Build.0 = Release|Any CPU | ||
{0735FD70-626C-4D88-94E3-4B410C5A41E9}.Debug|Any CPU.ActiveCfg = Debug|Any CPU | ||
{0735FD70-626C-4D88-94E3-4B410C5A41E9}.Debug|Any CPU.Build.0 = Debug|Any CPU | ||
{0735FD70-626C-4D88-94E3-4B410C5A41E9}.Release|Any CPU.ActiveCfg = Release|Any CPU | ||
{0735FD70-626C-4D88-94E3-4B410C5A41E9}.Release|Any CPU.Build.0 = Release|Any CPU | ||
{FD776EED-DBA5-4789-A1A7-FF97EF11FBD3}.Debug|Any CPU.ActiveCfg = Debug|Any CPU | ||
{FD776EED-DBA5-4789-A1A7-FF97EF11FBD3}.Debug|Any CPU.Build.0 = Debug|Any CPU | ||
{FD776EED-DBA5-4789-A1A7-FF97EF11FBD3}.Release|Any CPU.ActiveCfg = Release|Any CPU | ||
{FD776EED-DBA5-4789-A1A7-FF97EF11FBD3}.Release|Any CPU.Build.0 = Release|Any CPU | ||
EndGlobalSection | ||
GlobalSection(SolutionProperties) = preSolution | ||
HideSolutionNode = FALSE | ||
EndGlobalSection | ||
EndGlobal |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
<Project Sdk="Microsoft.NET.Sdk"> | ||
|
||
<ItemGroup> | ||
<ProjectReference Include="..\DocumentManagement\DocumentManagement.csproj"/> | ||
</ItemGroup> | ||
|
||
<PropertyGroup> | ||
<RootNamespace>Mohaymen.FullTextSearch.App</RootNamespace> | ||
<OutputType>Exe</OutputType> | ||
<TargetFramework>net8.0</TargetFramework> | ||
<ImplicitUsings>enable</ImplicitUsings> | ||
<Nullable>enable</Nullable> | ||
</PropertyGroup> | ||
|
||
<ItemGroup> | ||
<PackageReference Include="Microsoft.Extensions.Logging" Version="8.0.0"/> | ||
<PackageReference Include="Microsoft.Extensions.Logging.Console" Version="8.0.0"/> | ||
</ItemGroup> | ||
|
||
<ItemGroup> | ||
<EmbeddedResource Update="Assets\Resources.resx"> | ||
<Generator>ResXFileCodeGenerator</Generator> | ||
<LastGenOutput>Resources.Designer.cs</LastGenOutput> | ||
</EmbeddedResource> | ||
</ItemGroup> | ||
|
||
<ItemGroup> | ||
<None Update="Assets/**"> | ||
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory> | ||
</None> | ||
</ItemGroup> | ||
|
||
<ItemGroup> | ||
<Compile Update="Assets\Resources.Designer.cs"> | ||
<DesignTime>True</DesignTime> | ||
<AutoGen>True</AutoGen> | ||
<DependentUpon>Resources.resx</DependentUpon> | ||
</Compile> | ||
</ItemGroup> | ||
|
||
</Project> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
using Mohaymen.FullTextSearch.DocumentManagement.Models; | ||
|
||
namespace Mohaymen.FullTextSearch.App.Interfaces; | ||
|
||
public interface IInputParser | ||
{ | ||
List<SearchQuery> ParseToSearchQuery(string input); | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
using Microsoft.Extensions.Logging; | ||
using Mohaymen.FullTextSearch.App.Utilities; | ||
using Mohaymen.FullTextSearch.DocumentManagement.Models; | ||
using Mohaymen.FullTextSearch.DocumentManagement.Services.InvertedIndexService; | ||
using Mohaymen.FullTextSearch.App.Services; | ||
using Mohaymen.FullTextSearch.App.UI; | ||
using Mohaymen.FullTextSearch.Assets; | ||
using Mohaymen.FullTextSearch.DocumentManagement.Interfaces; | ||
using Mohaymen.FullTextSearch.DocumentManagement.Services.FilesService; | ||
using Mohaymen.FullTextSearch.DocumentManagement.Utilities; | ||
|
||
namespace Mohaymen.FullTextSearch.App; | ||
|
||
internal class Program | ||
{ | ||
public static void Main() | ||
{ | ||
var documentsPath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, Resources.DocumentsPath); | ||
var fileLoader = new FileLoader(new FileReader()); | ||
var fileCollection = fileLoader.LoadFiles(documentsPath); | ||
var invertedIndex = IndexFiles(fileCollection); | ||
var invertedIndexSearcher = new InvertedIndexSearcher(invertedIndex); | ||
var parser = new InputParser(); | ||
var userInterface = new UserInterface(invertedIndexSearcher, parser); | ||
userInterface.StartProgramLoop(); | ||
} | ||
|
||
private static IInvertedIndex IndexFiles(FileCollection fileCollection) | ||
{ | ||
Logging<Program>.Logger.LogInformation("Processing files..."); | ||
var tokenizer = new Tokenizer(); | ||
var advancedInvertedIndexBuilder = new FilesAdvancedInvertedIndexBuilder(tokenizer); | ||
var invertedIndex = advancedInvertedIndexBuilder.IndexFilesWords(fileCollection).Build(); | ||
Logging<Program>.Logger.LogInformation("{fileCount} files loaded.", fileCollection.FilesCount()); | ||
return invertedIndex; | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
using Microsoft.Extensions.Logging; | ||
using Mohaymen.FullTextSearch.DocumentManagement.Models; | ||
using Mohaymen.FullTextSearch.App.Utilities; | ||
using Mohaymen.FullTextSearch.DocumentManagement.Interfaces; | ||
|
||
namespace Mohaymen.FullTextSearch.App.Services; | ||
|
||
public class FileLoader | ||
{ | ||
private readonly IFileReader _fileReader; | ||
|
||
public FileLoader(IFileReader fileReader) | ||
{ | ||
_fileReader = fileReader; | ||
} | ||
|
||
public FileCollection LoadFiles(string documentsPath) | ||
{ | ||
try | ||
{ | ||
var fileCollection = _fileReader.ReadAllFiles(documentsPath); | ||
return fileCollection; | ||
} | ||
catch (DirectoryNotFoundException exception) | ||
{ | ||
Logging<FileLoader>.Logger.LogError(exception, "Wrong Folder Path: {path}", documentsPath); | ||
throw; | ||
} | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
using Mohaymen.FullTextSearch.App.Interfaces; | ||
using Mohaymen.FullTextSearch.Assets; | ||
using Mohaymen.FullTextSearch.DocumentManagement.Interfaces; | ||
|
||
|
||
namespace Mohaymen.FullTextSearch.App.UI; | ||
|
||
public class UserInterface | ||
{ | ||
private readonly ISearcher<string> _searcher; | ||
private readonly IInputParser _parser; | ||
private const string QuitCommand = "!q"; | ||
|
||
public UserInterface(ISearcher<string> searcher, IInputParser parser) | ||
{ | ||
ArgumentNullException.ThrowIfNull(searcher); | ||
ArgumentNullException.ThrowIfNull(parser); | ||
_searcher = searcher; | ||
_parser = parser; | ||
} | ||
|
||
public void StartProgramLoop() | ||
{ | ||
while (true) | ||
{ | ||
var input = GetInput(); | ||
|
||
if (input == QuitCommand) | ||
break; | ||
|
||
var containingFiles = GetContainingFiles(input); | ||
|
||
DisplayResult(containingFiles); | ||
} | ||
} | ||
|
||
private ICollection<string> GetContainingFiles(string input) | ||
{ | ||
var searchQueries = _parser.ParseToSearchQuery(input); | ||
return _searcher.Search(searchQueries); | ||
} | ||
|
||
private void DisplayResult(ICollection<string> containingFiles) | ||
{ | ||
if (containingFiles.Count == 0) | ||
{ | ||
Console.WriteLine("No result for your statement"); | ||
return; | ||
} | ||
|
||
var count = containingFiles.Count; | ||
Console.WriteLine($"Word found in {count} file{(count > 1 ? "s" : "")}"); | ||
Console.WriteLine("----------------------"); | ||
// The documents root does not change between files, so resolve it once. | ||
var documentsPath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, Resources.DocumentsPath); | ||
foreach (var filePath in containingFiles) | ||
{ | ||
var relativePath = Path.GetRelativePath(documentsPath, filePath); | ||
Console.WriteLine($"File '{relativePath}'"); | ||
} | ||
Console.WriteLine("----------------------"); | ||
} | ||
|
||
private string GetInput() | ||
{ | ||
Console.Write("Enter your statement (Enter !q to exit): "); | ||
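// ReadLine returns null at end of input; treat that as a quit request so the loop cannot spin forever. | ||
var input = Console.ReadLine()?.Trim() ?? QuitCommand; | ||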
return input; | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
using System.Text.RegularExpressions; | ||
using Mohaymen.FullTextSearch.App.Interfaces; | ||
using Mohaymen.FullTextSearch.DocumentManagement.Models; | ||
using Mohaymen.FullTextSearch.DocumentManagement.Services.InvertedIndexService.SearchStrategies; | ||
|
||
namespace Mohaymen.FullTextSearch.App.Utilities; | ||
|
||
public class InputParser : IInputParser | ||
{ | ||
// SplitRegexPattern matches single words and quoted phrases, each with an optional '+' or '-' prefix. | ||
private const string SplitRegexPattern = @"[+-]?\b\w+\b|[+-]?""[^""]+"""; | ||
// Built once and reused across calls; Regex instances are immutable and thread-safe. | ||
private static readonly Regex SplitRegex = new(SplitRegexPattern); | ||
|
||
public List<SearchQuery> ParseToSearchQuery(string input) | ||
{ | ||
var mandatoryWords = new List<Keyword>(); | ||
var optionalWords = new List<Keyword>(); | ||
var excludedWords = new List<Keyword>(); | ||
|
||
var matches = SplitRegex.Matches(input); | ||
|
||
foreach (Match match in matches) | ||
{ | ||
var word = match.Value; | ||
|
||
if (word.StartsWith('+')) | ||
optionalWords.Add(new Keyword(word.Substring(1).Trim('"'))); | ||
else if (word.StartsWith('-')) | ||
excludedWords.Add(new Keyword(word.Substring(1).Trim('"'))); | ||
else | ||
mandatoryWords.Add(new Keyword(word.Trim('"'))); | ||
} | ||
|
||
return [ | ||
new SearchQuery(new MandatorySearchStrategy(), mandatoryWords), | ||
new SearchQuery(new OptionalSearchStrategy(), optionalWords), | ||
new SearchQuery(new ExcludedSearchStrategy(), excludedWords) | ||
]; | ||
} | ||
} |
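
A quick way to see how `SplitRegexPattern` tokenizes a query is to run the pattern outside the parser. The sketch below is illustrative only: the sample input is made up, and it records plain labels instead of building the PR's `Keyword` and `SearchQuery` objects. It does follow the same prefix rule as `InputParser` (`+` optional, `-` excluded, no prefix mandatory).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Same pattern as InputParser.SplitRegexPattern.
const string pattern = @"[+-]?\b\w+\b|[+-]?""[^""]+""";

// Hypothetical sample input, not taken from the pull request.
const string input = @"get +help -""star wars"" ""hello world""";

var results = new List<string>();
foreach (Match match in Regex.Matches(input, pattern))
{
    var token = match.Value;
    var kind = token[0] switch
    {
        '+' => "optional",
        '-' => "excluded",
        _ => "mandatory"
    };
    // Drop the single leading prefix (if any) and the surrounding quotes.
    var text = token.TrimStart('+', '-').Trim('"');
    results.Add($"{kind}: {text}");
}

foreach (var line in results)
    Console.WriteLine(line);
// mandatory: get
// optional: help
// excluded: star wars
// mandatory: hello world
```

One consequence worth checking against the tests: a hyphenated word such as `well-known` is split into `well` (mandatory) and `-known` (excluded), because `-` is always read as a prefix.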
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
using Microsoft.Extensions.Logging; | ||
|
||
namespace Mohaymen.FullTextSearch.App.Utilities; | ||
|
||
public static class Logging<TCategoryName> | ||
{ | ||
public static ILogger<TCategoryName> Logger { get; } | ||
|
||
static Logging() | ||
{ | ||
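// The factory is intentionally not disposed: disposing it also disposes the console provider behind Logger. | ||
var loggerFactory = LoggerFactory.Create(builder => | ||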
{ | ||
builder.AddConsole(); | ||
}); | ||
Logger = loggerFactory.CreateLogger<TCategoryName>(); | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
add your docs to this folder |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
<?xml version="1.0" encoding="utf-8"?> | ||
|
||
<root> | ||
<xsd:schema id="root" xmlns="" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata"> | ||
<xsd:element name="root" msdata:IsDataSet="true"> | ||
|
||
</xsd:element> | ||
</xsd:schema> | ||
<resheader name="resmimetype"> | ||
<value>text/microsoft-resx</value> | ||
</resheader> | ||
<resheader name="version"> | ||
<value>1.3</value> | ||
</resheader> | ||
<resheader name="reader"> | ||
<value>System.Resources.ResXResourceReader, System.Windows.Forms, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value> | ||
</resheader> | ||
<resheader name="writer"> | ||
<value>System.Resources.ResXResourceWriter, System.Windows.Forms, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value> | ||
</resheader> | ||
<data name="DocumentsPath" xml:space="preserve"> | ||
<value>Assets/Documents</value> | ||
</data> | ||
</root> |
_searcher = searcher ?? throw new ArgumentNullException(nameof(searcher));
We think our approach is more readable.
That said, if you have a specific reason for preferring this form, please explain further.
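
For reference, here is a minimal side-by-side sketch of the two null-check idioms discussed in this thread. The class names and the `dependency` parameter are hypothetical stand-ins, not the PR's `UserInterface` or `ISearcher<string>`. Both forms throw an `ArgumentNullException` that carries the parameter name, so the choice is mostly stylistic; `ArgumentNullException.ThrowIfNull` requires .NET 6 or later, which the `net8.0` target satisfies.

```csharp
using System;

// Both constructors reject null and report the offending parameter name.
var helperName = ParamNameOf(() => new GuardedByHelper(null!));
var expressionName = ParamNameOf(() => new GuardedByThrowExpression(null!));
Console.WriteLine(helperName);      // dependency
Console.WriteLine(expressionName);  // dependency

// Runs the constructor call and returns the ParamName of the exception it throws, if any.
static string? ParamNameOf(Action construct)
{
    try { construct(); return null; }
    catch (ArgumentNullException e) { return e.ParamName; }
}

// Style used in the PR: guard first, then assign.
class GuardedByHelper
{
    private readonly object _dependency;

    public GuardedByHelper(object dependency)
    {
        ArgumentNullException.ThrowIfNull(dependency);
        _dependency = dependency;
    }
}

// Style suggested in the review: guard and assign in one expression.
class GuardedByThrowExpression
{
    private readonly object _dependency;

    public GuardedByThrowExpression(object dependency)
    {
        _dependency = dependency ?? throw new ArgumentNullException(nameof(dependency));
    }
}
```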