Feedback From a User #173

Closed
Liskelleo opened this issue Dec 25, 2022 · 14 comments

@Liskelleo

Liskelleo commented Dec 25, 2022

Hello author, I am a sophomore and I have made a simple formula editor based on your library. I'm not majoring in computer science, so I only have some ideas and cannot put them into practice. I think version 0.2.0 of this library is a great improvement over previous versions, but there are still some things that need to be improved. The following items are feedback based on my experience as a user, and I hope you will take them into consideration.
1 The inability to convert integral expressions
2 The inability to display matrices
3 The inability to display some special types of letters (such as x_hat, x_tilde, x_bar, etc.)
4 The inability to display some special symbols (such as mod, Hamiltonian operator, partial differential operator, etc.)
5 Some strange output (pictured), where the "∈" symbol does not conform to the specification

sqrt(sum(x**2 for x in coordinates))

image
6 Only the power form can be written, not the superscript (what if I want to write this symbol: "$x^{b}$")
7 It is unable to display the n-th root sign, maybe we can let "$\sqrt[n]{x}$" be represented as "pow(x, 1/n)"
8 Why not add expressions in "math" library to this library, like "exp()", "angle()", "log(m, n)", etc., which can make it more functional?
P.S. Can you provide a complete user manual, showing all formulas available in the current version? I think there are too few examples provided in documents on GitHub.
From my perspective, the difficulty lies in how to enable the input expression to be executed by Python on the one hand and displayed correctly on the other (I mean successfully converted to a LaTeX formula).
I have seen some of the above points in the issue section, and the above points are the main ones I want to mention.
Best wishes and merry Christmas^^

@odashi
Collaborator

odashi commented Dec 25, 2022

Hi @Liskelleo, thanks for the many suggestions! Since several of the features you suggested are already supported on the main branch, could you check how the newest version works? These changes will be released in v0.3.

Can you provide a complete user manual, showing all formulas available in the current version? I think there are too few examples provided in documents on GitHub.

It is possible to increase the number of examples. Showing "all" formulas is somewhat infeasible because it is essentially equivalent to representing all the rules in this library in another format. I think it is feasible to add only examples of supported functions, operators, and some special functionalities.

From my perspective, the difficulty lies in how to enable the input expression to be executed by Python on the one hand and displayed correctly on the other (I mean successfully converted to a LaTeX formula).

Yes. The objective of this library is to show the given Python snippet as a "formula", and its internal behavior depends strictly on Python's syntax. Supporting all of mathematics is not in the scope of this library at this point. I think that should be another project, as it would eventually require us to develop another language.


Replies for enumerated questions (according to the newest implementation):

The inability to convert integral expressions

Tracked by #142, where we are discussing support for (only) definite integrals.

The inability to display matrices

Supported by #143

The inability to display some special types of letters (such as x_hat, x_tilde, x_bar, etc.)

These forms require discussion. Specifically, we need to handle several ambiguities (e.g., is x_y_bar $\bar{x_y}$ or $\bar{x}_y$?).

The inability to display some special symbols (such as mod, Hamiltonian operator, partial differential operator, etc.)

These operators basically can't be expressed in Python without extra syntax and/or other library support.
mod is technically expressed as the % operator, although this operator returns the remainder and doesn't express modular arithmetic.
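
A small sketch of that distinction, using only the standard library: `%` produces a number, whereas a congruence $a \equiv b \pmod{n}$ is a statement that has to be spelled out as a comparison. The helper name `is_congruent` is made up for illustration and is not part of latexify.

```python
def is_congruent(a: int, b: int, n: int) -> bool:
    """Return True when a ≡ b (mod n); hypothetical helper, not part of latexify."""
    return (a - b) % n == 0


remainder = 17 % 12                  # the % operator yields a number
congruent = is_congruent(17, 5, 12)  # the congruence is a yes/no statement
print(remainder, congruent)
```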

Some strange output (pictured), where the "∈" symbol does not conform to the specification

Could you provide what is the desired output of sqrt(sum(x**2 for x in coordinates))?
I think $\sum_{x \in X} f(x)$ is the usual notation for summations, and this output looks correct insofar as the library relies on the user's input (regardless of the strange appearance of $coordinates$).
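
To see where the "∈" comes from, it may help to look at how Python itself parses the expression. The sketch below uses only the standard `ast` module (Python 3.9+ for `ast.unparse`); it does not show latexify's internals, only the structure any converter would receive.

```python
import ast

tree = ast.parse("sqrt(sum(x**2 for x in coordinates))", mode="eval")
sum_call = tree.body.args[0]      # the sum(...) call inside sqrt(...)
gen = sum_call.args[0]            # the generator expression
comp = gen.generators[0]          # the "for x in coordinates" clause

loop_variable = comp.target.id    # name bound on each iteration
iterated_name = comp.iter.id      # name of the collection being iterated over
summand = ast.unparse(gen.elt)    # the expression that is summed
print(loop_variable, iterated_name, summand)
```

The summand is the squared loop variable, and `coordinates` only appears as the collection being iterated, which is why a form like $\sum_{x \in \mathrm{coordinates}} x^2$ follows the code closely.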

Only the power form can be written, not the superscript (what if I want to write this symbol: "$x^{b}$")

x**b is converted to $x^b$. pow(x, b) can also be expanded to $x^b$ with the expand_functions option, which is not properly documented yet.
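
As a rough illustration of why the two spellings are handled separately, the standard `ast` module shows that Python parses them into different node types: an operator in one case and an ordinary function call in the other. This only inspects Python's own syntax tree and makes no claim about how latexify implements the option.

```python
import ast

power_node = ast.parse("x**b", mode="eval").body
call_node = ast.parse("pow(x, b)", mode="eval").body

power_kind = type(power_node).__name__    # node type of the operator form
op_kind = type(power_node.op).__name__    # which operator it is
call_kind = type(call_node).__name__      # node type of the function form
called_name = call_node.func.id           # name of the called function
print(power_kind, op_kind, call_kind, called_name)
```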

It is unable to display the n-th root sign, maybe we can let "$\sqrt[n]{x}$" be represented as "pow(x, 1/n)"

Good catch, we could support this syntax as an option.
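
A minimal sketch of how such an optional rule could recognize the pattern, assuming the input is a single expression string. The function `nth_root_latex` is hypothetical and not part of latexify, and `ast.unparse` (Python 3.9+) is used as a stand-in for real LaTeX code generation of the sub-expressions.

```python
import ast
from typing import Optional


def nth_root_latex(source: str) -> Optional[str]:
    """Return an n-th root in LaTeX for pow(x, 1/n), or None if the pattern does not match."""
    node = ast.parse(source, mode="eval").body
    is_pow_call = (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "pow"
        and len(node.args) == 2
    )
    if not is_pow_call:
        return None
    base, exponent = node.args
    is_reciprocal = (
        isinstance(exponent, ast.BinOp)
        and isinstance(exponent.op, ast.Div)
        and isinstance(exponent.left, ast.Constant)
        and exponent.left.value == 1
    )
    if not is_reciprocal:
        return None
    # Sub-expressions are emitted as Python source here, not as real LaTeX.
    return rf"\sqrt[{ast.unparse(exponent.right)}]{{{ast.unparse(base)}}}"


print(nth_root_latex("pow(x, 1/n)"))
print(nth_root_latex("pow(x, 2)"))
```

Anything that is not literally `pow(<expr>, 1/<expr>)` returns `None`, so the rule would stay opt-in and would not affect other calls.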

Why not add expressions in "math" library to this library, like "exp()", "angle()", "log(m, n)", etc., which can make it more functional?

  • exp ... supported (by expand_functions)
  • angle ... there is no such function in math
  • log(m, n) ... lack of implementation: this form should be supported.

@Liskelleo
Author

Liskelleo commented Dec 26, 2022

@odashi Thank you for your reply^^

These forms require discussion. Specifically, we need to handle several ambiguity (e.g., is x_y_bar $\bar{x_y}$ or $\bar{x}_y$?)

Personally, I think this should be judged according to the brackets in user's expression. For example, $\bar{x_y}$ should be written as "(x_y)_bar" and $\bar{x}_y$ should be written as "(x_bar)_y".

These operators can't be basically expressed in Python without extra syntax and/or other library support.

This is correct, and what I found later is that differentiation, integration, and limits can be computed with the SymPy library and converted to LaTeX expressions for display. Even matrix functions can be found in SymPy.
Here is an example of partial differentiation in SymPy:
import sympy as sy
x, y, z = sy.symbols('x y z')
f1 = sy.exp(x*y*z)
deriv = sy.Derivative(f1, x, y, y, z, 4)
print("{}={}".format(sy.latex(deriv),sy.latex(deriv.doit())))
output: "\frac{\partial^{7}}{\partial z^{4}\partial y^{2}\partial x} e^{x y z}=x^{3} y^{2} \left(x^{3} y^{3} z^{3} + 14 x^{2} y^{2} z^{2} + 52 x y z + 48\right) e^{x y z}", which can be displayed by LaTeX like this:
$$\frac{\partial^{7}}{\partial z^{4}\partial y^{2}\partial x} e^{x y z}=x^{3} y^{2} \left(x^{3} y^{3} z^{3} + 14 x^{2} y^{2} z^{2} + 52 x y z + 48\right) e^{x y z}$$

Could you provide what is the desired output of sqrt(sum(x**2 for x in coordinates))?

For this question, I think that the required output of sqrt(sum(x**2 for x in coordinates)) should be $$\sqrt{\sum_{x} {coordinates}^{2}}$$, where coordinates is taken as a variable.

x**b is converted to $x^b$. pow(x, b) can also be expanded to $x^b$ with expand_functions option, which are not correctly documented yet.

I don't think we can distinguish a superscript from a power here. Some special superscripts are not used for calculation but are just marks, such as x' or x*. Even $x^b$ does not necessarily represent x to the power b; it may also be just a mark.

angle ... there is no such function in math

Sorry, I remembered wrongly, but there is an "angle()" function in NumPy. In fact, I wanted to use this point to raise the question of how to switch between radians and degrees. Generally, radians are the default, but what if the user wants to enter an angle in degrees?

P.S.: Have you considered using a universal parameter to access most or all of the functions? In this way, the formula editor I am completing can be operated in an integrated way, instead of adjusting the parameter as needed.

@odashi
Collaborator

odashi commented Dec 26, 2022

(x_y)_bar

We can't express this in Python because it is not valid syntax. As I commented above, this library doesn't aim to introduce any additional syntax at the level of the Python interpreter.

SymPy

SymPy is a library that constructs its own syntax objects in Python, and it requires executing the code to construct the syntax tree. Latexify is a library that converts existing Python code to a corresponding expression, so the two libraries differ in the domains they support.

Latexify could also provide some functionality to parse SymPy objects as a plugin, but that would require a lot of development IMO.

a universal parameter to access most or all of the functions

This doesn't fully make sense to me; could you provide some examples?
If this means exposing all functions in this package, that is technically possible already, because Python doesn't hide any objects inside the library.
However, we don't provide any support for internal objects because they may change arbitrarily during development.

$\sqrt{\sum_x coordinates^2}$

This is not a correct representation of the original expression. Here coordinates must be a collection of something (typically a set of integers), and the representation above has no meaning. E.g., imagine replacing coordinates with {1,2,3}; then we get $\sqrt{\sum_x \{1,2,3\}^2}$.
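
The same substitution can be tried directly in Python. The sketch below uses the example set {1, 2, 3} from above: the original expression evaluates fine, while squaring the collection itself (what the proposed notation would read as) is not defined for a set.

```python
import math

coordinates = {1, 2, 3}

# The generator expression squares each element, then sum() adds them up.
norm = math.sqrt(sum(x**2 for x in coordinates))

# Squaring the collection itself is not defined for a set.
try:
    coordinates ** 2
    squaring_failed = False
except TypeError:
    squaring_failed = True

print(norm, squaring_failed)
```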

angle

If we provided some support, it looks like angle(z) should be converted to $\arg z$ by default and angle(z, deg=True) to $\frac{180}{\pi} \arg z$, because the return value is a scalar that can be passed to another function. There is no syntax to represent degree units in Python, and it is better not to provide any additional interpretation for this kind of function.
E.g., numpy.sin(numpy.angle(z, deg=True)) is always syntactically valid (regardless of whether the result is intended or not).
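
A standard-library sketch of the same point, using `cmath.phase` and `math.degrees` as stand-ins for `numpy.angle` (an assumption made here only to avoid a NumPy dependency): the degree value is a plain float with no unit attached, so passing it to a sine function is accepted and silently read as radians.

```python
import cmath
import math

z = 1j
radians_value = cmath.phase(z)               # argument of z in radians
degrees_value = math.degrees(radians_value)  # same angle, rescaled by 180/pi

# Both results are plain floats; nothing marks degrees_value as "degrees".
print(type(radians_value).__name__, type(degrees_value).__name__)

# sin() therefore treats the value as radians, not as a right angle.
sin_of_degrees = math.sin(degrees_value)
print(sin_of_degrees)
```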

@Liskelleo
Author

Liskelleo commented Dec 26, 2022

(x_y)_bar

It's weird that the library can convert "f=x[ab]" to "\mathrm{f} = {x_{ab}}" even though "x[ab]" is not correct Python syntax, which is why it cannot appear in a function name but only in the function's expression. So why not let "(x_y)_bar" or "x[y]_bar" appear in the expression?

@odashi
Collaborator

odashi commented Dec 26, 2022

x[ab] is correct Python syntax that represents array indexing, and latexify has a rule to convert it to $x_{\mathrm{ab}}$. Neither (x_y)_bar nor x[y]_bar has any meaning in Python.

@odashi
Collaborator

odashi commented Dec 26, 2022

More specifically:

  • Only an identifier is allowed as a function name. An identifier here is a name consisting only of alphanumeric characters or "_". The underscore is just one of the characters that can form identifiers and has no special meaning in Python's syntax itself.

  • Python doesn't have a syntax rule expr expr.
    Here is what happens when we give (x_y)_bar to the (bottom-up) parser:

    1. "(" "x_y" ")" "_bar"
    2. "(" identifier ")" "_bar"
    3. "(" expr ")" "_bar"
    4. expr "_bar"
    5. expr identifier
    6. expr expr ERROR

    or x[y]_bar:

    1. "x" "[" "y" "]" "_bar"
    2. identifier "[" "y" "]" "_bar"
    3. expr "[" "y" "]" "_bar"
    4. expr "[" identifier "]" "_bar"
    5. expr "[" expr "]" "_bar"
    6. index "_bar"
    7. expr "_bar"
    8. expr identifier
    9. expr expr ERROR
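
The outcome of the two traces above can be checked with Python's own parser. This is a small sketch using the standard `ast` module; the helper name `parses` is made up for illustration.

```python
import ast


def parses(source: str) -> bool:
    """Return True if source is a syntactically valid Python expression."""
    try:
        ast.parse(source, mode="eval")
        return True
    except SyntaxError:
        return False


candidates = ["x[ab]", "x_y_bar", "(x_y)_bar", "x[y]_bar"]
results = {src: parses(src) for src in candidates}
print(results)
```

Only the indexing form and the plain identifier are accepted; the two forms that place an identifier directly after a complete expression are rejected before any conversion to LaTeX could take place.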

@Liskelleo
Author

Liskelleo commented Dec 27, 2022

Identifier

Oh, I got it. Then I think we could let "$\bar{x_y}$" be written as "x_y__bar", "$\bar{x}_y$" be written as "x__bar_y", and "$x_{\bar{y}}$" be written as "x_y_bar", where "__bar" is interpreted as a bar over the whole part preceding it and "_bar" as a bar over the single part preceding it.
In addition, I wonder if we can apply a similar rule to function names, which cannot be written as "x[ab]": we could input "x__ab", where "__" indicates that the subscript is longer than one character, and "$x_{a_{b}}$" could be written as "x_a_b".

P.S.: What do you think of the previous comments on the upper corner mark ("superscript" as I commented) and power?
image

@odashi
Collaborator

odashi commented Dec 27, 2022

Then I think we could let "$\bar{x_y}$" be written as "x_y__bar"

This is still ambiguous, as it can be interpreted as either $\bar{x_y}$ or $x_{\bar{y}}$.
This approach ultimately requires developing another context-free grammar for identifier names, since the resulting LaTeX could have any kind of tree structure. It is not really practical to introduce such a complex rule into this library just to realize this kind of feature.

In the newest implementation of this library, we removed automatic subscripting too for the same reason. Now every identifier is converted to either:

  • x ... single character
  • \mathrm{abc\_def} ... multiple characters with/without "_"
  • \alpha ... Math symbols (only when use_math_symbols=True)

and the library doesn't have any other intelligent behavior anymore.
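
The three cases listed above can be sketched as a small function. This is an illustration of the described behavior only, not latexify's actual code: the function name `identifier_to_latex` and the tiny `MATH_SYMBOLS` set are assumptions, while `use_math_symbols` is the option named in the list.

```python
MATH_SYMBOLS = {"alpha", "beta", "gamma"}  # small illustrative subset (assumption)


def identifier_to_latex(name: str, use_math_symbols: bool = False) -> str:
    """Sketch of the three identifier cases; not latexify's actual implementation."""
    if use_math_symbols and name in MATH_SYMBOLS:
        return "\\" + name                  # e.g. alpha becomes a Greek letter
    if len(name) == 1:
        return name                         # single character stays as-is
    # Longer names are set upright, with underscores escaped rather than subscripted.
    return r"\mathrm{" + name.replace("_", r"\_") + "}"


for example in ["x", "abc_def", "alpha"]:
    print(example, "->", identifier_to_latex(example, use_math_symbols=True))
```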

We already have functionality for substituting identifiers with other identifiers before code generation. I think it would be better to implement similar functionality that replaces the specified identifiers with the final LaTeX.
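
As a rough sketch of what identifier substitution before code generation can look like, the example below renames `Name` nodes with the standard `ast.NodeTransformer` (Python 3.9+ for `ast.unparse`). The class `IdentifierReplacer` and the sample mapping are hypothetical; this is not the library's implementation.

```python
import ast


class IdentifierReplacer(ast.NodeTransformer):
    """Rename identifiers according to a mapping (hypothetical helper)."""

    def __init__(self, mapping: dict[str, str]) -> None:
        self.mapping = mapping

    def visit_Name(self, node: ast.Name) -> ast.Name:
        # Unknown names are kept unchanged.
        new_id = self.mapping.get(node.id, node.id)
        return ast.copy_location(ast.Name(id=new_id, ctx=node.ctx), node)


tree = ast.parse("total_energy = kinetic + potential")
mapping = {"total_energy": "E", "kinetic": "T", "potential": "V"}
renamed_source = ast.unparse(IdentifierReplacer(mapping).visit(tree))
print(renamed_source)
```

Replacing identifiers with final LaTeX strings would follow the same shape, except that the mapping values would be emitted verbatim at code generation time instead of being treated as new Python names.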

superscript

I don't plan to provide a way to write user-defined superscripts for the same reason.

Btw, we are also discussing introducing plugin support in #165. Although we haven't settled on a design yet, this feature may make it possible to apply user-defined conversion rules. I guess most features in this thread could be implemented that way, but it is not desirable to introduce them as the default behavior of this library.

@Liskelleo
Author

This is still ambiguous as it can be interpreted as both $x_{\bar{y}}$ and $\bar{x_{y}}$ .

How come? Can you explain it?

@odashi
Collaborator

odashi commented Dec 27, 2022

Ah sorry, your explanation above covers all cases if there are only 2 variables in the same identifier. At a glance there's no ambiguity, but there are 2 essential points:

  • Lack of completeness: if we assume x_y_z is converted to $x_{y_z}$, there is no rule to express $x_{\bar{y_z}}$.
  • This is a new "grammar" on identifier names that users need to understand. As I mentioned above, introducing new syntax by default is not a purpose of this library. It is still welcome if it is provided as a plugin.

@Liskelleo
Author

Liskelleo commented Dec 28, 2022

if we assumed "x_y_z" is converted to x_{y_z} there are no rule to express x_\bar{y_z}.

Actually, x_{y_z} can be written as "x_y_z" and x_\bar{y_z} can be written as "x__y_z__bar", where "__bar" is interpreted as a bar over the whole part after the last "__" that the parser recognizes.

P.S.: Is this a new "grammar"? To be honest, I'd rather think of it as a rule for defining some formula identifiers in this library than as a "new syntax", because people use it only when they use this library, and this so-called "rule" (from my perspective) can actually be recognized by the parser. Unlike the "[ ]-grammar of the subscript", this so-called "rule" doesn't need any other rules to convert it to LaTeX syntax and doesn't raise any error when interpreted on its own.

What do you think of the points above? Btw, it's interesting for me to think about all these things. Thanks for answering patiently.

@Liskelleo
Author

Liskelleo commented Dec 30, 2022

Anyway, I think that compared to the subscript, the superscript is harder to express as an identifier within valid Python syntax, because legal Python identifiers include only alphanumeric characters and underscores, which means we would actually have to introduce a new syntax by default to convert a superscript to LaTeX code. But to express subscripts correctly, we may just need a rule about underscores like the one above. @odashi

@odashi
Collaborator

odashi commented Dec 30, 2022

As I noted above, we wouldn't introduce any default rules other than those I mentioned in #173 (comment), and we have already dropped even subscripting from the newest implementation (except for the indexing syntax). This is because (1) users would need to learn unnecessary rules to use this library, and (2) it doesn't work with existing code. The current implementation aims to convert every function into LaTeX through @latexify.algorithmic. Since underscores are usually not used in other code to represent the semantics we discussed here, introducing additional default rules around underscores would probably break the appearance of many functions. This includes even subscripting: foo_bar might be converted to $\mathrm{foo}_{\mathrm{bar}}$ if we introduced a subscripting rule, but this is not the desired behavior in most cases.

It is more suitable to introduce such functionality as an optional plugin. If you are interested in implementing such a mechanism, feel free to fork this library and try to make a pull request.
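
To make the concern concrete, here is a sketch of a naive default subscripting rule applied to ordinary identifiers. The functions `wrap` and `naive_subscript` are hypothetical and exist only to show the effect; they are not taken from latexify.

```python
def wrap(part: str) -> str:
    """Single letters stay as-is; longer names are set upright with escaped underscores."""
    if len(part) == 1:
        return part
    return r"\mathrm{" + part.replace("_", r"\_") + "}"


def naive_subscript(name: str) -> str:
    """Hypothetical default rule: everything after the first "_" becomes a subscript."""
    head, sep, tail = name.partition("_")
    if not (head and sep and tail):
        return wrap(name)
    return wrap(head) + "_{" + wrap(tail) + "}"


for name in ["x_i", "foo_bar", "learning_rate"]:
    print(name, "->", naive_subscript(name))
```

The rule does what a formula author wants for `x_i`, but it also splits everyday snake_case names such as `learning_rate` into a base and a subscript, which is the kind of breakage described above.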

@Liskelleo
Author

Alright then. Thank you.


2 participants