IndexError: list index out of range #85

jjconnolly85 · 2024-05-31T11:18:44Z

Hi there,
thank you in advance for the help.

I am trying to run a simple de-duplication process, but I am encountering the following error: IndexError: list index out of range

Sample data

data = [
    ("1", "arte musica poetica"),
    ("2", "arte-musica-poetica UG"),
    ("3", "another entity"),
    ("4", "another entity ltd"),
    ("5", "random"),
]

cols = ["id", "name"]
df = spark.createDataFrame(data, cols)

autolinker = AutoLinker()

autolinker.auto_link(
    data=df,
)