feat: adds substring method for Text with tests #523

Status: Open. Wants to merge 7 commits into base: master. Showing changes from 5 commits.

50 changes: 49 additions & 1 deletion src/Text.mo
@@ -363,7 +363,7 @@ module {
};

/// Splits the input `Text` with the specified `Pattern`.
///
///
/// Two fields are separated by exactly one match.
///
/// ```motoko include=import
Expand Down Expand Up @@ -425,6 +425,54 @@ module {
}
};

/// Returns the substring of the input `Text` that begins at position `start` and contains up to `len` characters.
/// If `len` is 0, returns the empty string.
///
/// ```motoko include=import
/// Text.substring("This is a sentence.", 0, 4) // "This"
/// Text.substring("This is a sentence.", 5, 4) // "is a"
/// Text.substring("This is a sentence.", 0, 0) // ""
/// ```
public func substring(t : Text, start : Int, len : Int) : Text {
Contributor:

Unless you've got something more clever planned for negative args, I'd use Nat for start and len and then just two explicit loops over the iterator's elements (calling iter.next() explicitly): one to advance the iterator to the start position, the second to append the chars. The other question is whether you want to trap when out of bounds, or return null with return type ?Text instead of Text.
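
For illustration, the two-loop approach suggested here might look roughly like the sketch below. It is not the PR's code: the name `substringOpt` is hypothetical, and it assumes the option-returning variant (`?Text`), yielding `null` whenever the requested range runs past the end of the input.

```motoko
import Text "mo:base/Text";

// Hypothetical helper, not part of the PR or of the base library.
func substringOpt(t : Text, start : Nat, len : Nat) : ?Text {
  let iter = t.chars();

  // First loop: advance the iterator past the first `start` characters.
  var skipped = 0;
  while (skipped < start) {
    switch (iter.next()) {
      case null { return null }; // `start` lies beyond the end of `t`
      case (?_) { skipped += 1 };
    };
  };

  // Second loop: append the next `len` characters.
  var output = "";
  var taken = 0;
  while (taken < len) {
    switch (iter.next()) {
      case null { return null }; // fewer than `len` characters remain
      case (?c) {
        output #= Text.fromChar(c);
        taken += 1;
      };
    };
  };
  ?output
};
```

Under these assumptions, `substringOpt("This is a sentence.", 5, 4)` would give `?"is a"`, while `substringOpt("abc", 3, 1)` would give `null`. Trapping instead of returning `null` would only change the two `case null` branches.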

Contributor Author:

Yeah, that makes sense in terms of Motoko conventions. I based this on a script I was writing, where I used it in conjunction with indexOf, which returned -1 if the pattern wasn't found. Both can simply return options and use Nats instead.
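
As a rough sketch of what "return options and use Nats" could mean for the index lookup, the snippet below replaces the -1 sentinel with `?Nat`. The name `indexOfChar` is hypothetical, and it searches for a single `Char` rather than a `Pattern` purely to keep the example short; the author's actual indexOf is not shown in this PR.

```motoko
// Hypothetical helper: position of the first occurrence of `target` in `t`,
// or `null` when it does not occur (instead of a -1 sentinel).
func indexOfChar(t : Text, target : Char) : ?Nat {
  var index = 0;
  for (c in t.chars()) {
    if (c == target) { return ?index };
    index += 1;
  };
  null
};
```

A caller would then `switch` on the result and only compute a substring in the `?i` case, so a "not found" result can never be mistaken for a valid position.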

Contributor Author:

This was based on the JavaScript patterns, where you are allowed to use negative indexes to count backward from the end of iterable structures. There's no need to introduce that to Motoko, however.

Contributor Author:

I think it's nicer to return an empty string than null. I've defined those cases in the doc comment: where start or length exceeds the length of the base string, it will

  • return "" if start is beyond input.size()
  • return the substring from the starting position to the end of the input if length exceeds the rest
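
The two rules above can be sketched as follows, assuming `Nat` arguments as suggested earlier in the thread. The name `substringClamped` is hypothetical and this is only one possible way to get the clamping behaviour, not the PR's implementation.

```motoko
import Text "mo:base/Text";

// Hypothetical helper: never fails, clamps the requested range to the input.
func substringClamped(t : Text, start : Nat, len : Nat) : Text {
  var output = "";
  var index = 0;
  label scan for (c in t.chars()) {
    // Stop as soon as the requested range [start, start + len) is exhausted.
    if (index >= start + len) { break scan };
    // Characters before `start` are skipped; the rest are appended.
    if (index >= start) { output #= Text.fromChar(c) };
    index += 1;
  };
  output
};
```

With this sketch, a `start` beyond the input never reaches the append branch, so the result is `""`, and a `len` that exceeds the remaining characters simply ends when the iterator does.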

  // A zero or negative length always yields the empty string.
  if (len <= 0) { return "" };
  // A negative start counts backward from the end of `t`.
  let from : Int = if (start < 0) { t.size() + start } else { start };
  let to : Int = from + len; // exclusive upper bound
  var output = "";
  var count : Int = 0;
  for (char in t.chars()) {
    // Append only the characters whose index falls in [from, to).
    if (count >= from and count < to) {
      output := output # fromChar(char)
    };
    // Advance the index for every character, whether appended or not.
    count += 1
  };
  output
};

/// Returns a sequence of tokens from the input `Text` delimited by the specified `Pattern`, derived from start to end.
/// A "token" is a non-empty maximal subsequence of `t` not containing a match for pattern `p`.
/// Two tokens may be separated by one or more matches of `p`.
68 changes: 68 additions & 0 deletions test/textTest.mo
@@ -325,6 +325,74 @@ run(
)
);

run(
  suite(
    "substring",
    [
      test(
        "zero length",
        Text.substring("abc", 0, 0),
        M.equals(T.text "")
      ),
      test(
        "length of 1 from start",
        Text.substring("abc", 0, 1),
        M.equals(T.text "a")
      ),
      test(
        "length of 2 from start",
        Text.substring("abc", 0, 2),
        M.equals(T.text "ab")
      ),
      test(
        "length of 3 from start",
        Text.substring("abc", 0, 3),
        M.equals(T.text "abc")
      ),
      test(
        "length of 1 from middle",
        Text.substring("abc", 1, 1),
        M.equals(T.text "b")
      ),
      test(
        "length of 2 from middle",
        Text.substring("abc", 1, 2),
        M.equals(T.text "bc")
      ),
      test(
        "length of 1 from end",
        Text.substring("abc", 2, 1),
        M.equals(T.text "c")
      ),
      test(
        "length of 2 from end",
        Text.substring("abc", 2, 2),
        M.equals(T.text "c")
      ),
      test(
        "should handle negative start",
        Text.substring("abc", -1, 1),
        M.equals(T.text "c")
      ),
      test(
        "should handle negative length",
        Text.substring("abc", 0, -1),
        M.equals(T.text "")
      ),
      test(
        "should handle negative start and length",
        Text.substring("abc", -1, -1),
        M.equals(T.text "")
      ),
      test(
        "should handle start past end",
        Text.substring("abc", 3, 1),
        M.equals(T.text "")
      )
    ]
  )
);

do {
let pat : Text.Pattern = #predicate(func(c : Char) : Bool { c == ';' or c == '!' });
run(