Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attachments, Tags to and from xml, Bracket-access for tags, Recursive SimpleTags #28

Open
wants to merge 45 commits into
base: main
Choose a base branch
from

Conversation

aulemahal
Copy link
Contributor

@aulemahal aulemahal commented Jun 25, 2017

Restarting from my old pull requests, here are all my changes in one request.

  1. Attachments

MKV has now as "attachments" list of Attachment object with all the Mastroka elements: Name, Description, Mime Type and Data. Data is a BytesIO object with the raw binary data of the attachment.
A keyword argument in MKV.init "load_attachments" can turn off the rawData reading, saving some memory space.

  1. Recursive SimpleTags

Like in the Mastroka docs, SimpleTags can contain children SimpleTags.

  1. Tags to and from xml

Convenience functions using xml.etree.ElementTree Elements

  • mkv.tags_from_xml(xmlElement) : Load Tag data from a xml Element and append it to the mkv.tags.
  • xmlElement = mkv.tags_to_xml() : Write a xml element from the Tags in a mkv.

This is useful for saving changed tag data with mkvpropedit. Also, it could be used (as I do) to create a MKV-like object from another type of file (not mkv).

  1. Bracket-accsess for tags

Convenience getitem. MKV object can be bracket-accessed with a targettype (or targettypevalue) and this returns a list of all tags corresponding to this targettype. The sames apply for tags which return a list of simpletags corresponding to the requested name. They both return lists because you could have multiple SimpleTags with the same Name.

Ex.
Return the first tag with a targettypevalue of 50: mkv[50][0]
Return this tag's first simpletag with name 'TITLE': mkv[50][0]['TITLE'][0]

  1. Tag addition and nesting

To easily add SimpleTags to a Tag, you can now use the division operator.
Ex. Adding a SimpleTag ACTOR with string "John Smith" and sub-simpletag CHARACTER with string "Foo Bar" to the first tag with targettypevalue of 50.
mkv[50][0] / SimpleTag('ACTOR', string='John Smith') / SimpleTag('CHARACTER', string='Foo Bar')

This formalism is, I think, useful when using enzyme interactively.

@aulemahal
Copy link
Contributor Author

It seems that the getsizeof() I was doing in some tests about the attachments was not returning the same value as on my machine. It depends on the filesystem I guess. So, instead, I take the integer division by 1000 and compare this approximate value.

enzyme/mkv.py Outdated
@@ -330,7 +338,7 @@ def fromXML(cls, xmlElement):
:type xmlElement: :class:`xml.etree.ElementTree.Element`

"""
targets = Targets.fromXML(xmlElement.find('Targets')) if xmlElement.find('Targets') else Targets()
targets = Targets() if xmlElement.find('Targets') is None else Targets.fromXML(xmlElement.find('Targets'))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference with the previous code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because of the order of execution of the statement, having xmlElement.find('Targets') == None would raise an error when Targets.fromXML(...) is evaluated. No what we want. Second version tests before evaluating.


def to_dict(self):
info_dict = self.__dict__.copy()
info_dict['duration'] = self.duration.seconds + self.duration.microseconds/1e6
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is self.duration.microseconds/1e6 pep8-valid? I think it needs extra spaces around the /

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked on python.org and it says :

If operators with different priorities are used, consider adding whitespace around the operators with the lowest priority(ies).

I guess that applies here : the / has higher priority than +.

def to_dict(self):
info_dict = self.__dict__.copy()
info_dict['duration'] = self.duration.seconds + self.duration.microseconds/1e6
info_dict['date_utc'] = ('{0:%Y}-{0:%m}-{0:%d} {0:%H}:{0:%M}:{0:%S}:{0:%f}'.format(self.date_utc))[:-3], # The official Mastroka date standard
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not store the date as datetime object?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, the idea behind the to_dict function was to have a convenient way to serialize the object, like with JSON. And json.dump() does not accept datetime object. Also, this output is the standard for Matroska.

enzyme/mkv.py Outdated
@@ -547,6 +555,11 @@ def fromelement(cls, element):
return cls(start, hidden, enabled, end, uid, string, language)
return cls(start, hidden, enabled, end, uid)

def to_dict(self):
chap_dict = self.__dict__.copy()
chap_dict['start'] = self.start.seconds + self.start.microseconds/1e6
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same regarding pep8.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Operation priorities again.

@Diaoul
Copy link
Owner

Diaoul commented Jul 20, 2017

I'm not sure about the bracket access for tags, I find it confusing. mkv[50] isn't something very explicit. However something like mkv.tags[50] looks better.
Same for the div thing, not very intuitive. I can't think of another python object that uses the div operator that way. List extends uses the + operator and it doesn't support nesting. Even BeautifulSoup I think doesn't go that far in tag/xml nesting operations.

@aulemahal
Copy link
Contributor Author

For the div functionality, I was thinking of the pathlib.Path object which use something similar. Perhaps, your are right and it is not intuitive. I'll remove it on next commit.
For the bracket access, I also agree, but I wanted it to be simple as a getItem. I have an idea I'll suggest in next commit.

@aulemahal
Copy link
Contributor Author

There. Removed the div's and simply renamed the __getitem__ by a getTag() function. Everything else is the same. I find it a bit less coherent than the bracket access but, at the same time, more explicit, as you said. mkv.tags[50] would have been the best, but mkv.tags is already a list and I couldn't think of a simple workaround.

@Diaoul
Copy link
Owner

Diaoul commented Jul 25, 2017

Can you remove the last commit and submit a new PR for that change? Also you need to update the unittests accordingly as travis fails.

This reverts commit 30b3a60.
This will be in another pull request.
@aulemahal
Copy link
Contributor Author

Woups. Forgot about the unittests this time... Sorry. But another pull request it will be.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants