Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PageRank #88

Merged
merged 13 commits into from
Mar 18, 2024
Merged

PageRank #88

merged 13 commits into from
Mar 18, 2024

Conversation

kirillgarbar
Copy link
Contributor

Proposed Changes

Added PageRank algorithm and some minor changes to BFS and SSSP

Types of changes

What types of changes does your code introduce to GraphBLAS-sharp?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • Build and tests pass locally
  • I have added tests that prove my fix is effective or that my feature works (if appropriate)
  • I have added necessary documentation (if appropriate)

Comment on lines +20 to +27
let one =
<@ fun (x: float32 option) (_: int option) ->
let mutable res = 0

match x with
| Some _ -> res <- 1
| None -> ()

if res = 0 then None else Some res @>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make it shorter like this?

Suggested change
let one =
<@ fun (x: float32 option) (_: int option) ->
let mutable res = 0
match x with
| Some _ -> res <- 1
| None -> ()
if res = 0 then None else Some res @>
let one =
<@ fun (x: float32 option) (_: int option) ->
if x.IsNone then None else Some 1 @>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsNone is not supported in Brahma

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can implement this with quotations. Is it worth doing this in Brahma?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@IgorErin I'm not sure that I see an alternative way rather that to support IsNone translation in Brahma.

Copy link
Contributor

@IgorErin IgorErin Dec 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gsvgit I'm thinking about a standard library for quotas in Brahma. Here you can, for example, iter. And you won't have to hardcode into the compiler.

Comment on lines 49 to 58
<@ fun (x: float32 option) y ->
let mutable res = 0.0f

match x, y with
| Some _, Some y -> res <- alpha / (float32 y)
| Some _, None -> ()
| None, Some _ -> ()
| None, None -> ()

if res = 0.0f then None else Some res @>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<@ fun (x: float32 option) y ->
let mutable res = 0.0f
match x, y with
| Some _, Some y -> res <- alpha / (float32 y)
| Some _, None -> ()
| None, Some _ -> ()
| None, None -> ()
if res = 0.0f then None else Some res @>
<@ fun (x: float32 option) y ->
let mutable res = None
match x, y with
| Some _, Some y -> res <- Some (alpha / (float32 y))
| _ -> ()
res @>

Comment on lines 15 to 18
let alpha = 0.85f
let accuracy = 0.00000001f

let countOutDegree (clContext: ClContext) workGroupSize =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably it is worth making these constants and functions private. They are specific for current module, user does not want to see them.

@@ -244,3 +244,16 @@ module ArithmeticOperations =
| Some x, None -> Some x
| None, Some y -> Some y
| _ -> None @>

//PageRank specific
let minusAndSquare =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably name squareOfDifference sounds more clear?

Comment on lines 140 to 141
let alpha = 0.85f
let accuracy = 0.00000001f
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constants are repeating.


let prepareMatrix (clContext: ClContext) workGroupSize =

let alpha = 0.85f
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeat.

open GraphBLAS.FSharp.Objects
open GraphBLAS.FSharp.Objects.MatrixExtensions

let testFixtures (testContext: TestContext) =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weak tests.

Comment on lines 19 to 26
let run = PageRank.run

let prepareMatrix = PageRank.prepareMatrix
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that prepareMatrix should at least have some clarifying comments. Also, to run page rank algorithm user always has to call prepareMatrix first. And I don't think he wants to build a lego house every time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always calling prepareMatrix before algorithm isn't necessary. User might already have matrix in correct format, for example in benchmarks. User may also want to obtain the prepared matrix and use it elsewhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we inform the user about this need? What guarantees are there that this condition will be met?

We cannot afford to set such semantic traps at any level of abstraction.

If a precondition arises, it must be expressed in types so that the user is not able to pass an unprepared matrix.

Otherwise, any mention of functional style should be removed from the title of our work "graphblas in functional style".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we do not need to inform user about it because it is not our goal to provide PageRank implementation but to provide enough building blocks to implement it.
This implementation is nothing more but example and test of our api.
It is required to know how PageRank works to implement it using our library, so I think we should not care a lot about specific graph algorithms such as BFS or PageRank inside our library

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gsvgit What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ready-to-use garph analysis algorithms should be a part of library. It is the next (possibli highest) leyer of our solution (look at LaGraph). So, implementations of BFS, PAgeRank and other algorithms should be as clear and documented, as possible. Anyway, even basic implmenetation should be a clear anough to be a starting point for those who want ot develop new algorithms using provieed building blocks.

Regarding types. I agree with @IgorErin: if we can express something using types, we should do it. Ofcourse we should find the best way to do it.

@@ -14,3 +14,8 @@ module Algorithms =

module SSSP =
let singleSource = SSSP.run
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let singleSource = SSSP.run
let run = SSSP.run

@kirillgarbar
Copy link
Contributor Author

Added PageRankMatrix type as single case union type which is produced by prepareMatrix

@@ -67,14 +67,14 @@ module internal BFS =
levels

let singleSourceSparse
(add: Expr<bool option -> bool option -> bool option>)
(mul: Expr<bool option -> bool option -> bool option>)
(add: Expr<int option -> int option -> int option>)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we should use int instead of bool?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason is that spla and GraphBLAST use int and float matrices on BFS and comparison will be more fair. But now I think it is a strange reason so I'll change that to bool again

@@ -19,3 +19,8 @@ module Algorithms =

module SSSP =
let run = SSSP.run

module PageRank =
let run = PageRank.run
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think documentation is needed. At least we should say that matrix has to be prepared first by calling prepareMatrix method.

let mutable matrixPrepared = Unchecked.defaultof<PageRankMatrix<float32>>
let mutable matrixHost = Unchecked.defaultof<_>

let accuracy = 0.00000001f
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are predefined constants in Expecto.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still, why is it necessary separately? Perhaps we need to create a module with constants and actions on them in order to avoid mistakes in the future?

Abstract types could also be used here, so that they could be used only with the help of this most magical module.

Benchmarks.InputMatrixProviderBuilder "BFSBenchmarks.txt"

[<GlobalSetup>]
override this.GlobalSetup() =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've always thought about it. Don't you need to make the preparatory methods blocking? Then is it possible to avoid blocking in the [<Benchmark>] itself?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a good idea to block after preparotory methods in case some of them are not blocking by itself, but blocking in the [<Benchmark>] is still necessary to wait for all kernels to execute.

Comment on lines +20 to +27
let one =
<@ fun (x: float32 option) (_: int option) ->
let mutable res = 0

match x with
| Some _ -> res <- 1
| None -> ()

if res = 0 then None else Some res @>
Copy link
Contributor

@IgorErin IgorErin Dec 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gsvgit I'm thinking about a standard library for quotas in Brahma. Here you can, for example, iter. And you won't have to hardcode into the compiler.

/// <summary>
/// Represents an abstraction over matrix, which is converted to correct format for PageRank algorithm
/// </summary>
type PageRankMatrix<'a when 'a: struct> =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great. But it needs to be even stronger. Abstract types are needed here. Like here in the first example.

https://okmij.org/ftp/Computation/lightweight-static-guarantees.html

I was told that this can be done using signatures. The idea is that we should not allow anyone to use this type unless they have passed some kind of verification. It seems to me that this also applies to disjoint matrices, @artemiipatov .

Of course, this can be disabled during benchmarking.

In the same way, we will add another functional and at the same time zero abstraction. This is exactly what is needed on FHPNC

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signatures are not working?

open GraphBLAS.FSharp.Objects

let private alpha = 0.85f
let private accuracy = 0.00001f
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expecto have some constants like that. We use they in tests

@gsvgit gsvgit requested a review from IgorErin December 18, 2023 09:20
Copy link
Contributor

@IgorErin IgorErin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't tried them in F#, is it possible to use abstract types in the language?

@@ -113,10 +113,12 @@ type WithoutTransferBenchmark<'elem when 'elem : struct>(
override this.GlobalSetup() =
this.ReadMatrix()
this.LoadMatrixToGPU()
this.Processor.PostAndReply(Msg.MsgNotifyMe)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much is it necessary to duplicate this? Is it possible to make an extension?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to make extension for specialization of generic MailboxProcessor and failed, as far as I know it is possible only for static members and static method would look not much better than this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Seems like that. Than maybe just

module Queue = 
    let block (q: MailboxProcessor<Msg>) = q.PostAndReply(Msg.MsgNotifyMe)

in GraphBlas-sharp.Backend.Objects

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was told that this may work.

[<System.Runtime.CompilerServices.Extension>] // На F# 8 можно убрать.
type MailboxProcessorExts =
    [<System.Runtime.CompilerServices.Extension>]
    static member Block(this : MailboxProcessor<int>) =
        this.Post 42

We can choose!

let mutable matrixPrepared = Unchecked.defaultof<PageRankMatrix<float32>>
let mutable matrixHost = Unchecked.defaultof<_>

let accuracy = 0.00000001f
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still, why is it necessary separately? Perhaps we need to create a module with constants and actions on them in order to avoid mistakes in the future?

Abstract types could also be used here, so that they could be used only with the help of this most magical module.

/// <summary>
/// Represents an abstraction over matrix, which is converted to correct format for PageRank algorithm
/// </summary>
type PageRankMatrix<'a when 'a: struct> =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signatures are not working?

let addition =
create queue DeviceOnly vertexCount Dense (Some((1.0f - alpha) / (float32 vertexCount)))

let mutable error = accuracy + 0.1f
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm all about a separate module for constants. Then it would be possible to take out actions like reciprocal and the like. Then, perhaps such actions will look more meaningful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or use predefined ones. Or comment on it. It's just that the number of magic constants scares me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still confuses me. And float arithmetic 3 line above too!

@kirillgarbar
Copy link
Contributor Author

kirillgarbar commented Dec 21, 2023

Added signature file for PageRank module so it is now impossible to do something withPageRankMatrix from outside.
Had to made module Backend.PageRank not internal because type PageRankMatrix was not visible from Algorithms.PageRank module, all members except for type were made internal instead.
Also, I was forced to do implementation of type as discriminated union due to compiler arguing about type abbreviation.

@IgorErin
Copy link
Contributor

@gsvgit Can we disable document assembly on Windows? This red mark is disorienting

Copy link
Contributor

@IgorErin IgorErin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great. There's nothing left

@@ -113,10 +113,12 @@ type WithoutTransferBenchmark<'elem when 'elem : struct>(
override this.GlobalSetup() =
this.ReadMatrix()
this.LoadMatrixToGPU()
this.Processor.PostAndReply(Msg.MsgNotifyMe)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Seems like that. Than maybe just

module Queue = 
    let block (q: MailboxProcessor<Msg>) = q.PostAndReply(Msg.MsgNotifyMe)

in GraphBlas-sharp.Backend.Objects

let addition =
create queue DeviceOnly vertexCount Dense (Some((1.0f - alpha) / (float32 vertexCount)))

let mutable error = accuracy + 0.1f
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still confuses me. And float arithmetic 3 line above too!

@@ -113,10 +113,12 @@ type WithoutTransferBenchmark<'elem when 'elem : struct>(
override this.GlobalSetup() =
this.ReadMatrix()
this.LoadMatrixToGPU()
this.Processor.PostAndReply(Msg.MsgNotifyMe)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was told that this may work.

[<System.Runtime.CompilerServices.Extension>] // На F# 8 можно убрать.
type MailboxProcessorExts =
    [<System.Runtime.CompilerServices.Extension>]
    static member Block(this : MailboxProcessor<int>) =
        this.Post 42

We can choose!

/// </summary>
let alpha = 0.85f

module Common =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how convenient it is to move something that is used only in tests here. But okay

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously it was in Backend.Utils and used in SpMV for the reason that I forgot. I imagine we can use it somewhere if we know for sure that this is the optimal size for the algorithm

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that the optimal size should be in benchmarks. And there will be several of them, depending on the hardware and other things. So that it is not just optimal, but everyone can check it relatively quickly in a particular case, just by cloning the repository and writing a couple of commands. But that's sometime later.

If this was hardcoded in SpMV, then shouldn't we make a separate submodule for SpMV in the constants module? Just like you did for PageRank?

It is better to use a constant from the SpMV module in tests than to use a constant from tests in SpMV.

His whole idea is just not to forget where the constants came from or what actions to take with them. In my opinion, putting it in a separate module with an obvious name while providing comments with the reason is very good.

open GraphBLAS.FSharp.Objects.ArraysExtensions
open GraphBLAS.FSharp.Objects.ClCellExtensions

module PageRank =
Copy link
Contributor

@IgorErin IgorErin Dec 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[<RequierQulifierAccess>] for all API modules

@kirillgarbar
Copy link
Contributor Author

Don't know why I can't reply to a comment about let mutable error.
This variable is created in such way that while cycle will be entered, looks pretty clear to me.
Also, explaining why vectors above are created in such way in a few lines of comments instead of article about PageRank is not possible, implementing PageRank implies that person would understand why such values are used.
Only reasonable thing to do in this case might be a link to the article in the comment

@IgorErin
Copy link
Contributor

Yes, then we can stop at the comments

@gsvgit gsvgit merged commit 7fe8ff3 into SparseLinearAlgebra:dev Mar 18, 2024
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants