Make word separators and splitters more flexible #402
base: master
Changes from all commits
e6b7c3d
637809c
14fa737
a105b03
e992b18
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,134 @@ | ||
use hyphenation::{Language, Load, Standard}; | ||
use rand::Rng as _; | ||
use textwrap::word_separators::WordSeparator as _; | ||
|
||
#[derive(Debug)] | ||
struct StyledWord<'a> { | ||
word: &'a str, | ||
whitespace: &'a str, | ||
hyphen: bool, | ||
style: Option<text_style::Style>, | ||
} | ||
|
||
impl StyledWord<'_> { | ||
fn render(&self, is_end: bool) { | ||
use text_style::termion::Termion as _; | ||
|
||
print!( | ||
"{}", | ||
text_style::StyledStr::new(self.word, self.style).termion() | ||
); | ||
|
||
if is_end { | ||
if self.hyphen { | ||
print!("{}", text_style::StyledStr::new("-", self.style).termion()); | ||
} | ||
} else { | ||
print!("{}", self.whitespace); | ||
} | ||
} | ||
} | ||
|
||
impl AsRef<str> for StyledWord<'_> { | ||
fn as_ref(&self) -> &str { | ||
&self.word | ||
} | ||
} | ||
|
||
impl<'a> From<text_style::StyledStr<'a>> for StyledWord<'a> { | ||
fn from(word: text_style::StyledStr<'a>) -> Self { | ||
let trimmed = word.s.trim_end_matches(' '); | ||
Self { | ||
word: trimmed, | ||
whitespace: &word.s[trimmed.len()..], | ||
hyphen: false, | ||
style: word.style, | ||
} | ||
} | ||
} | ||
|
||
impl textwrap::core::Fragment for StyledWord<'_> { | ||
fn width(&self) -> usize { | ||
self.word.len() | ||
} | ||
|
||
fn whitespace_width(&self) -> usize { | ||
self.whitespace.len() | ||
} | ||
|
||
fn penalty_width(&self) -> usize { | ||
if self.hyphen { | ||
1 | ||
} else { | ||
0 | ||
} | ||
} | ||
} | ||
|
||
impl textwrap::word_splitters::Splittable for StyledWord<'_> { | ||
type Output = Self; | ||
|
||
fn split(&self, range: std::ops::Range<usize>, keep_ending: bool) -> Self::Output { | ||
let word = &self.word[range]; | ||
Self { | ||
word, | ||
whitespace: if keep_ending { self.whitespace } else { "" }, | ||
hyphen: if keep_ending { | ||
self.hyphen | ||
} else { | ||
!word.ends_with('-') | ||
}, | ||
style: self.style, | ||
} | ||
} | ||
} | ||
|
||
fn generate_style(rng: &mut impl rand::Rng) -> text_style::Style { | ||
let mut style = text_style::Style::default(); | ||
|
||
style.set_bold(rng.gen_bool(0.1)); | ||
style.set_italic(rng.gen_bool(0.1)); | ||
style.set_underline(rng.gen_bool(0.1)); | ||
style.strikethrough(rng.gen_bool(0.01)); | ||
|
||
style.fg = match rng.gen_range(0..100) { | ||
0..=10 => Some(text_style::AnsiColor::Red), | ||
11..=20 => Some(text_style::AnsiColor::Green), | ||
21..=30 => Some(text_style::AnsiColor::Blue), | ||
_ => None, | ||
} | ||
.map(|color| text_style::Color::Ansi { | ||
color, | ||
mode: text_style::AnsiMode::Light, | ||
}); | ||
|
||
style | ||
} | ||
|
||
fn main() { | ||
let dictionary = Standard::from_embedded(Language::EnglishUS).unwrap(); | ||
let mut rng = rand::thread_rng(); | ||
|
||
let text = lipsum::lipsum(rng.gen_range(100..500)); | ||
|
||
let styled = text | ||
.split_inclusive(' ') | ||
.map(|s| text_style::StyledStr::styled(s, generate_style(&mut rng))); | ||
let words: Vec<_> = styled | ||
.flat_map(|s| { | ||
textwrap::word_separators::AsciiSpace | ||
.find_word_ranges(&s.s) | ||
.map(move |range| text_style::StyledStr::new(&s.s[range], s.style)) | ||
}) | ||
.map(StyledWord::from) | ||
.flat_map(|w| textwrap::word_splitters::Fragments::new(w, &dictionary)) | ||
.collect(); | ||
|
||
let lines = textwrap::wrap_algorithms::wrap_first_fit(&words, &[50]); | ||
for line in lines { | ||
for (idx, fragment) in line.into_iter().enumerate() { | ||
fragment.render(idx + 1 == line.len()); | ||
} | ||
println!(); | ||
} | ||
} |
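The example above reports three widths per fragment: `width`, `whitespace_width` and `penalty_width`. The following is a minimal sketch of how a greedy first-fit wrapper might combine them. It is an illustration under stated assumptions, not the textwrap implementation: the `Fragment` trait here is a local stand-in with the same three method names, and `Piece` and `first_fit` are hypothetical names introduced only for this sketch.

```rust
/// Local stand-in for the trait implemented in the example; only the three
/// method names are taken from the diff.
trait Fragment {
    fn width(&self) -> usize;
    fn whitespace_width(&self) -> usize;
    fn penalty_width(&self) -> usize;
}

/// Hypothetical fragment for this sketch: (width, whitespace, penalty).
struct Piece(usize, usize, usize);

impl Fragment for Piece {
    fn width(&self) -> usize {
        self.0
    }
    fn whitespace_width(&self) -> usize {
        self.1
    }
    fn penalty_width(&self) -> usize {
        self.2
    }
}

/// Greedy first-fit: start a new line when the next fragment, together with
/// the penalty it would incur if it ended the line, no longer fits.
fn first_fit<T: Fragment>(fragments: &[T], line_width: usize) -> Vec<&[T]> {
    let mut lines = Vec::new();
    let mut start = 0; // index of the first fragment on the current line
    let mut width = 0; // width used so far, including trailing whitespace

    for (idx, fragment) in fragments.iter().enumerate() {
        let needed = width + fragment.width() + fragment.penalty_width();
        // `idx > start` keeps at least one fragment on every line, so an
        // over-long fragment cannot produce empty lines.
        if idx > start && needed > line_width {
            lines.push(&fragments[start..idx]);
            start = idx;
            width = 0;
        }
        // Whitespace counts only once another fragment follows on the line.
        width += fragment.width() + fragment.whitespace_width();
    }

    if start < fragments.len() {
        lines.push(&fragments[start..]);
    }
    lines
}
```

The penalty is only added to the candidate check and never to the running width. That matches how `render` in the example prints the hyphen only for the last fragment on a line.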
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -190,36 +190,79 @@ where | |
I: IntoIterator<Item = Word<'a>>, | ||
WordSplit: WordSplitter, | ||
{ | ||
words.into_iter().flat_map(move |word| { | ||
let mut prev = 0; | ||
let mut split_points = word_splitter.split_points(&word).into_iter(); | ||
std::iter::from_fn(move || { | ||
if let Some(idx) = split_points.next() { | ||
let need_hyphen = !word[..idx].ends_with('-'); | ||
let w = Word { | ||
word: &word.word[prev..idx], | ||
width: display_width(&word[prev..idx]), | ||
whitespace: "", | ||
penalty: if need_hyphen { "-" } else { "" }, | ||
}; | ||
prev = idx; | ||
return Some(w); | ||
} | ||
words | ||
.into_iter() | ||
.flat_map(move |word| Fragments::new(word, word_splitter)) | ||
} | ||
|
||
if prev < word.word.len() || prev == 0 { | ||
let w = Word { | ||
word: &word.word[prev..], | ||
width: display_width(&word[prev..]), | ||
whitespace: word.whitespace, | ||
penalty: word.penalty, | ||
}; | ||
prev = word.word.len() + 1; | ||
return Some(w); | ||
} | ||
#[allow(missing_docs)] | ||
pub trait Splittable: AsRef<str> { | ||
type Output; | ||
|
||
#[allow(missing_docs)] | ||
fn split(&self, range: std::ops::Range<usize>, keep_ending: bool) -> Self::Output; | ||
} | ||
|
||
None | ||
}) | ||
}) | ||
impl<'a> Splittable for Word<'a> { | ||
type Output = Self; | ||
|
||
fn split(&self, range: std::ops::Range<usize>, keep_ending: bool) -> Self::Output { | ||
let word = &self.word[range]; | ||
Word { | ||
word, | ||
width: display_width(word), | ||
whitespace: if keep_ending { self.whitespace } else { "" }, | ||
penalty: if keep_ending { | ||
self.penalty | ||
} else if !word.ends_with('-') { | ||
"-" | ||
} else { | ||
"" | ||
}, | ||
} | ||
} | ||
} | ||
|
||
#[allow(missing_docs)] | ||
#[derive(Debug)] | ||
pub struct Fragments<W: Splittable, I: Iterator<Item = usize>> { | ||
word: W, | ||
split_points: I, | ||
Review comment: This type parameter is unnecessary, we can directly use […] |
||
prev: usize, | ||
} | ||
|
||
impl<W: Splittable> Fragments<W, std::vec::IntoIter<usize>> { | ||
#[allow(missing_docs)] | ||
pub fn new(word: W, word_splitter: &impl WordSplitter) -> Self { | ||
let split_points = word_splitter.split_points(word.as_ref()).into_iter(); | ||
Self { | ||
word, | ||
split_points, | ||
prev: 0, | ||
} | ||
} | ||
} | ||
|
||
impl<W: Splittable, I: Iterator<Item = usize>> Iterator for Fragments<W, I> { | ||
type Item = W::Output; | ||
|
||
fn next(&mut self) -> Option<Self::Item> { | ||
if let Some(idx) = self.split_points.next() { | ||
let w = self.word.split(self.prev..idx, false); | ||
self.prev = idx; | ||
return Some(w); | ||
} | ||
|
||
let len = self.word.as_ref().len(); | ||
if self.prev < len || self.prev == 0 { | ||
let w = self.word.split(self.prev..len, true); | ||
// TODO: shouldn’t this be just len? | ||
Review comment: Sorry, I should have added a comment to explain this: with just […] you get an infinite loop when […]
Reply: If you remove the […]
Reply: I see, thanks. Will remove the TODO. |
||
self.prev = len + 1; | ||
return Some(w); | ||
} | ||
|
||
None | ||
} | ||
} | ||
|
||
#[cfg(test)] | ||
|
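The `len + 1` assignment discussed in the review thread can be read as a sentinel. The sketch below reduces the `Fragments` iterator from the diff to plain `&str` pieces to show one way the infinite loop could arise. This reading is derived from the code itself, since the inline code in the review comments did not survive extraction. `Pieces` is a hypothetical name, and the split points are assumed to be sorted byte offsets on character boundaries.

```rust
/// Reduced stand-in for `Fragments`: yields `&str` slices instead of words.
struct Pieces<'a> {
    word: &'a str,
    split_points: std::vec::IntoIter<usize>,
    prev: usize,
}

impl<'a> Pieces<'a> {
    fn new(word: &'a str, split_points: Vec<usize>) -> Self {
        Pieces {
            word,
            split_points: split_points.into_iter(),
            prev: 0,
        }
    }
}

impl<'a> Iterator for Pieces<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        // Emit one piece per split point.
        if let Some(idx) = self.split_points.next() {
            let piece = &self.word[self.prev..idx];
            self.prev = idx;
            return Some(piece);
        }

        let len = self.word.len();
        // `prev == 0` lets an empty word still yield one (empty) piece.
        if self.prev < len || self.prev == 0 {
            let piece = &self.word[self.prev..len];
            // Sentinel: `len + 1` makes both conditions false on the next
            // call, even when `len == 0`. With `self.prev = len`, an empty
            // word would leave `prev == 0` true and the iterator would
            // yield empty pieces forever.
            self.prev = len + 1;
            return Some(piece);
        }

        None
    }
}
```

Collecting `Pieces::new("", vec![])` therefore terminates after a single empty piece, which is the case the `prev == 0` check exists to handle.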
Review comment: It's a bit of a shame that the example doesn't show why there are two traits :-) I love the example in itself, it's super great at demonstrating the concept of wrapping not-just-plain-text. However, it would be nice if it would exploit the two traits better.

Will you be having different structs in `genpdf`, one for `Fragment` and another for `Splittable`? If not, then I would prefer to keep the number of concepts low and add a `split` method to `Fragment`.

Reply: Ah, sorry, you already explained that you have a `StyledWord` struct for the unmeasured case and a `StyledFragment` for the pre-split and measured words.

Reply: Exactly, the distinction is especially relevant if the width computation is non-trivial, which is typically the case for scenarios other than the terminal. I could add an example that produces a PDF file, but I think that would be too complex to be useful as an example for `textwrap`. Maybe we can have an example that produces an SVG image?

Reply: You're right that a full-blown PDF seems unnecessary. Could you instead pretend that you need two structs in the `style` example? I believe you're using `len()` on the strings, which is cheating ever so slightly :-)

It might look a bit arbitrary, but for educational purposes, I think we're allowed to exaggerate.

Reply: Yeah, we can do that.
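The two-struct split agreed on above could look roughly like the following. This is a hedged sketch, not the code that was eventually committed: both traits are local stand-ins whose method names mirror the diff, `measure` is a hypothetical width function standing in for real font metrics, and the field layout of `StyledWord` and `StyledFragment` is assumed for illustration.

```rust
use std::ops::Range;

// Local stand-ins so the sketch compiles without textwrap.
trait Splittable: AsRef<str> {
    type Output;
    fn split(&self, range: Range<usize>, keep_ending: bool) -> Self::Output;
}

trait Fragment {
    fn width(&self) -> usize;
    fn whitespace_width(&self) -> usize;
    fn penalty_width(&self) -> usize;
}

/// Hypothetical width function: wide glyphs cost more than narrow ones.
/// A real renderer would query font metrics here instead.
fn measure(s: &str) -> usize {
    s.chars()
        .map(|c| match c {
            'i' | 'l' | '.' | ',' | ' ' | '-' => 1,
            'm' | 'w' | 'M' | 'W' => 3,
            _ => 2,
        })
        .sum()
}

/// Unmeasured word: it only knows its text.
#[derive(Debug)]
struct StyledWord<'a> {
    word: &'a str,
    whitespace: &'a str,
}

/// Measured piece: widths are computed once, at split time.
#[derive(Debug)]
struct StyledFragment<'a> {
    word: &'a str,
    width: usize,
    whitespace_width: usize,
    penalty_width: usize,
}

impl AsRef<str> for StyledWord<'_> {
    fn as_ref(&self) -> &str {
        self.word
    }
}

impl<'a> Splittable for StyledWord<'a> {
    type Output = StyledFragment<'a>;

    fn split(&self, range: Range<usize>, keep_ending: bool) -> Self::Output {
        let word = &self.word[range];
        // Same hyphen rule as in the diff: a non-final piece needs a hyphen
        // unless it already ends in one.
        let needs_hyphen = !keep_ending && !word.ends_with('-');
        StyledFragment {
            word,
            width: measure(word),
            whitespace_width: if keep_ending { measure(self.whitespace) } else { 0 },
            penalty_width: if needs_hyphen { measure("-") } else { 0 },
        }
    }
}

impl Fragment for StyledFragment<'_> {
    fn width(&self) -> usize {
        self.width
    }
    fn whitespace_width(&self) -> usize {
        self.whitespace_width
    }
    fn penalty_width(&self) -> usize {
        self.penalty_width
    }
}
```

With this shape, `Splittable::Output` differs from `Self`, so the potentially expensive measurement runs once per fragment rather than on every call to `width`, which is the case the discussion describes as typical outside the terminal.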