LSP for python (#200)
* Update LLM dependencies in pyproject.toml to latest versions

* Update LLM dependencies in pyproject.toml to latest versions

* Update test assertion for exception message in AICaller test

* Update litellm dependency to use specific Git repository

* Remove unneeded model condition and adjust streaming parameter for o1 models

* poetry

* remove wrong reference to line numbers

* insert right reference to line numbers

* minor adjustments

* minor adjustments

* Add implementation of language server and LSP protocol handler for Python

* Add implementation of language server and LSP protocol handler for Python
mrT23 authored Oct 31, 2024
1 parent ab163ee commit 738bf47
Showing 42 changed files with 12,238 additions and 12 deletions.
25 changes: 25 additions & 0 deletions cover_agent/lsp_logic/file_map/Acknowledgment.md
@@ -0,0 +1,25 @@

# Credits

These queries, taken from the [Aider](https://github.com/paul-gauthier/aider/blob/main/aider/queries/README.md) project (Apache-2.0 license), are modified versions of the tags.scm files from the open source tree-sitter language implementations listed below.
If you find this work valuable, please consider also supporting the original Aider project with a star.

tree-sitter language implementations:

* [https://github.com/tree-sitter/tree-sitter-c](https://github.com/tree-sitter/tree-sitter-c) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-c-sharp](https://github.com/tree-sitter/tree-sitter-c-sharp) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-cpp](https://github.com/tree-sitter/tree-sitter-cpp) — licensed under the MIT License.
* [https://github.com/Wilfred/tree-sitter-elisp](https://github.com/Wilfred/tree-sitter-elisp) — licensed under the MIT License.
* [https://github.com/elixir-lang/tree-sitter-elixir](https://github.com/elixir-lang/tree-sitter-elixir) — licensed under the Apache License, Version 2.0.
* [https://github.com/elm-tooling/tree-sitter-elm](https://github.com/elm-tooling/tree-sitter-elm) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-go](https://github.com/tree-sitter/tree-sitter-go) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-java](https://github.com/tree-sitter/tree-sitter-java) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-javascript](https://github.com/tree-sitter/tree-sitter-javascript) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-ocaml](https://github.com/tree-sitter/tree-sitter-ocaml) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-php](https://github.com/tree-sitter/tree-sitter-php) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-python](https://github.com/tree-sitter/tree-sitter-python) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-ql](https://github.com/tree-sitter/tree-sitter-ql) — licensed under the MIT License.
* [https://github.com/r-lib/tree-sitter-r](https://github.com/r-lib/tree-sitter-r) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-ruby](https://github.com/tree-sitter/tree-sitter-ruby) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-rust](https://github.com/tree-sitter/tree-sitter-rust) — licensed under the MIT License.
* [https://github.com/tree-sitter/tree-sitter-typescript](https://github.com/tree-sitter/tree-sitter-typescript) — licensed under the MIT License.
144 changes: 144 additions & 0 deletions cover_agent/lsp_logic/file_map/file_map.py
@@ -0,0 +1,144 @@
import os
from pathlib import Path

from grep_ast import TreeContext
from grep_ast.parsers import PARSERS, filename_to_lang
# from pygments.lexers import guess_lexer_for_filename
# from pygments.token import Token
from tree_sitter_languages import get_language, get_parser

from cover_agent.lsp_logic.file_map.queries.get_queries import get_queries_scheme


class FileMap:
    """
    This class is used to summarize the content of a file using tree-sitter queries.
    Supported languages: C, C++, C#, elisp, elixir, go, java, javascript, ocaml, php, python, ql, ruby, rust, typescript
    """

    def __init__(self, fname_full_path: str, parent_context=True, child_context=False,
                 header_max=0, margin=0, project_base_path: str = None):
        self.fname_full_path = fname_full_path
        self.project_base_path = project_base_path
        if project_base_path:
            self.fname_rel = os.path.relpath(fname_full_path, project_base_path)
        else:
            self.fname_rel = fname_full_path
        # The query files live in the `queries` directory next to this module.
        self.main_queries_path = Path(__file__).parent / 'queries'
        if not os.path.exists(fname_full_path):
            # Fail early with a clear message instead of printing and then crashing in open().
            raise FileNotFoundError(f"File {fname_full_path} does not exist")
        with open(fname_full_path, "r") as f:
            code = f.read()
        # Normalize the file so that it ends with exactly one newline.
        self.code = code.rstrip("\n") + "\n"
        self.parent_context = parent_context
        self.child_context = child_context
        self.header_max = header_max
        self.margin = margin

    def summarize(self):
        # get_query_results() returns a (results, captures) pair; only the tags are needed here.
        query_results, _captures = self.get_query_results()
        summary_str = self.query_processing(query_results)
        return summary_str

    def render_file_summary(self, lines_of_interest: list):
        code = self.code
        fname_rel = self.fname_rel
        context = TreeContext(
            fname_rel,
            code,
            color=False,
            line_number=True,  # number the lines (1-indexed)
            parent_context=self.parent_context,
            child_context=self.child_context,
            last_line=False,
            margin=self.margin,
            mark_lois=False,
            loi_pad=0,
            header_max=self.header_max,  # max number of lines to show in a function header
            show_top_of_file_parent_scope=False,
        )

        context.lines_of_interest = set()
        context.add_lines_of_interest(lines_of_interest)
        context.add_context()
        res = context.format()
        return res

    def query_processing(self, query_results: list):
        if not query_results:
            return ""

        output = ""
        # Only definitions are rendered; references are ignored for the summary.
        def_lines = [q['line'] for q in query_results if q['kind'] == "def"]
        output += "\n"
        output += query_results[0]['fname'] + ":\n"
        output += self.render_file_summary(def_lines)
        return output

    def get_query_results(self):
        """Return a (results, captures) pair; both are empty if the file cannot be parsed."""
        fname_rel = self.fname_rel
        code = self.code
        lang = filename_to_lang(fname_rel)
        if not lang:
            # Unsupported file type: keep the same return shape as the success path.
            return [], []

        try:
            language = get_language(lang)
            parser = get_parser(lang)
        except Exception as err:
            print(f"Skipping file {fname_rel}: {err}")
            return [], []

        query_scheme_str = get_queries_scheme(lang)
        tree = parser.parse(bytes(code, "utf-8"))

        # Run the queries
        query = language.query(query_scheme_str)
        captures = list(query.captures(tree.root_node))

        # Parse the results into a list of "def" and "ref" tags
        visited_set = set()
        results = []
        for node, tag in captures:
            if tag.startswith("name.definition."):
                kind = "def"
            elif tag.startswith("name.reference."):
                kind = "ref"
            else:
                continue

            visited_set.add(kind)
            result = dict(
                fname=fname_rel,
                name=node.text.decode("utf-8"),
                kind=kind,
                line=node.start_point[0],  # zero-based row of the captured name
            )
            results.append(result)

        if "ref" in visited_set:
            return results, captures
        if "def" not in visited_set:
            return results, captures

        ## currently we are interested only in defs
        # # We saw defs, without any refs
        # # Some files only provide defs (cpp, for example)
        # # Use pygments to backfill refs
        # try:
        #     lexer = guess_lexer_for_filename(fname, code)
        # except Exception:
        #     return
        #
        # tokens = list(lexer.get_tokens(code))
        # tokens = [token[1] for token in tokens if token[0] in Token.Name]
        #
        # for t in tokens:
        #     result = dict(
        #         fname=fname,
        #         name=t,
        #         kind="ref",
        #         line=-1,
        #     )
        #     results.append(result)
        return results, captures
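The tagging loop in `get_query_results` can be tried in isolation. The sketch below is a minimal illustration that replaces tree-sitter nodes with plain `(name, tag, line)` tuples; the helper names `classify_captures` and `definition_lines` and the sample data are hypothetical and not part of this commit.

```python
from typing import Dict, Iterable, List, Tuple


def classify_captures(fname: str, captures: Iterable[Tuple[str, str, int]]) -> List[Dict]:
    """Turn (name, tag, zero_based_line) tuples into "def"/"ref" records.

    The tuples are stand-ins for the (node, tag) pairs returned by a
    tree-sitter query; this is an assumption made to keep the sketch
    free of third-party dependencies.
    """
    results = []
    for name, tag, line in captures:
        if tag.startswith("name.definition."):
            kind = "def"
        elif tag.startswith("name.reference."):
            kind = "ref"
        else:
            # Captures such as "definition.class" mark the whole node, not its name.
            continue
        results.append({"fname": fname, "name": name, "kind": kind, "line": line})
    return results


def definition_lines(results: List[Dict]) -> List[int]:
    """Collect the lines that a summary would treat as lines of interest."""
    return [r["line"] for r in results if r["kind"] == "def"]


# Hypothetical capture data, as a query over a Python file might produce it.
sample = [
    ("FileMap", "name.definition.class", 12),
    ("FileMap", "definition.class", 12),
    ("summarize", "name.definition.function", 37),
    ("get_parser", "name.reference.call", 86),
]
tags = classify_captures("file_map.py", sample)
print(definition_lines(tags))
```

Only the `name.definition.*` lines are passed on to the renderer, which matches the "currently we are interested only in defs" note in the code.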
13 changes: 13 additions & 0 deletions cover_agent/lsp_logic/file_map/queries/get_queries.py
@@ -0,0 +1,13 @@
import os
from pathlib import Path


def get_queries_scheme(lang: str) -> str:
    try:
        # Load the relevant queries
        curr_path = Path(__file__).parent
        path = os.path.join(curr_path, f"tree-sitter-{lang}-tags.scm")
        with open(path, "r") as f:
            return f.read()
    except FileNotFoundError:
        # No query file is shipped for this language; open() raises
        # FileNotFoundError here, not KeyError.
        return ""
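The lookup convention used by `get_queries_scheme` (one `tree-sitter-<lang>-tags.scm` file per language, empty string when none exists) can be sketched as follows. The function name `load_query_scheme`, the explicit `queries_dir` parameter, and the temporary directory are assumptions added so the example runs on its own.

```python
import tempfile
from pathlib import Path


def load_query_scheme(queries_dir: Path, lang: str) -> str:
    """Return the tags query for `lang`, or "" when no query file exists."""
    path = queries_dir / f"tree-sitter-{lang}-tags.scm"
    try:
        return path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return ""


with tempfile.TemporaryDirectory() as tmp:
    queries_dir = Path(tmp)
    # Hypothetical one-line query file, written only for this demonstration.
    (queries_dir / "tree-sitter-python-tags.scm").write_text(
        "(class_definition name: (identifier) @name.definition.class) @definition.class\n",
        encoding="utf-8",
    )
    found = load_query_scheme(queries_dir, "python")
    missing = load_query_scheme(queries_dir, "cobol")

print(repr(missing))
```

Passing the directory as a parameter, rather than deriving it from `__file__`, is a design choice of this sketch that makes the fallback path easy to exercise in a test.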
9 changes: 9 additions & 0 deletions cover_agent/lsp_logic/file_map/queries/tree-sitter-c-tags.scm
@@ -0,0 +1,9 @@
(struct_specifier name: (type_identifier) @name.definition.class body:(_)) @definition.class

(declaration type: (union_specifier name: (type_identifier) @name.definition.class)) @definition.class

(function_declarator declarator: (identifier) @name.definition.function) @definition.function

(type_definition declarator: (type_identifier) @name.definition.type) @definition.type

(enum_specifier name: (type_identifier) @name.definition.type) @definition.type
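A quick way to see which kinds of tags a query file can produce is to scan its capture names. The sketch below does this with a regular expression over the query text; `capture_kinds` and `CAPTURE_RE` are hypothetical names, and a regex scan is only an approximation (it does not parse the query, so an `@` inside a comment would be miscounted).

```python
import re
from collections import Counter

# Matches capture names such as "@name.definition.class".
CAPTURE_RE = re.compile(r"@([A-Za-z_][\w.\-]*)")


def capture_kinds(query_text: str) -> Counter:
    """Count how many captures would be tagged "def" and "ref"."""
    counts = Counter()
    for capture in CAPTURE_RE.findall(query_text):
        if capture.startswith("name.definition."):
            counts["def"] += 1
        elif capture.startswith("name.reference."):
            counts["ref"] += 1
    return counts


# Three patterns copied from the C tags query above.
C_TAGS = """
(struct_specifier name: (type_identifier) @name.definition.class body:(_)) @definition.class
(function_declarator declarator: (identifier) @name.definition.function) @definition.function
(enum_specifier name: (type_identifier) @name.definition.type) @definition.type
"""
kinds = capture_kinds(C_TAGS)
print(dict(kinds))
```

For this excerpt the scan finds definition captures only, which is consistent with the comment in `file_map.py` that some languages provide defs without refs.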
@@ -0,0 +1,46 @@
(class_declaration
  name: (identifier) @name.definition.class
) @definition.class

(class_declaration
  bases: (base_list (_) @name.reference.class)
) @reference.class

(interface_declaration
  name: (identifier) @name.definition.interface
) @definition.interface

(interface_declaration
  bases: (base_list (_) @name.reference.interface)
) @reference.interface

(method_declaration
  name: (identifier) @name.definition.method
) @definition.method

(object_creation_expression
  type: (identifier) @name.reference.class
) @reference.class

(type_parameter_constraints_clause
  target: (identifier) @name.reference.class
) @reference.class

(type_constraint
  type: (identifier) @name.reference.class
) @reference.class

(variable_declaration
  type: (identifier) @name.reference.class
) @reference.class

(invocation_expression
  function:
    (member_access_expression
      name: (identifier) @name.reference.send
    )
) @reference.send

(namespace_declaration
  name: (identifier) @name.definition.module
) @definition.module
15 changes: 15 additions & 0 deletions cover_agent/lsp_logic/file_map/queries/tree-sitter-cpp-tags.scm
@@ -0,0 +1,15 @@
(struct_specifier name: (type_identifier) @name.definition.class body:(_)) @definition.class

(declaration type: (union_specifier name: (type_identifier) @name.definition.class)) @definition.class

(function_declarator declarator: (identifier) @name.definition.function) @definition.function

(function_declarator declarator: (field_identifier) @name.definition.function) @definition.function

(function_declarator declarator: (qualified_identifier scope: (namespace_identifier) @scope name: (identifier) @name.definition.method)) @definition.method

(type_definition declarator: (type_identifier) @name.definition.type) @definition.type

(enum_specifier name: (type_identifier) @name.definition.type) @definition.type

(class_specifier name: (type_identifier) @name.definition.class) @definition.class
@@ -0,0 +1,8 @@
;; defun/defsubst
(function_definition name: (symbol) @name.definition.function) @definition.function

;; Treat macros as function definitions for the sake of TAGS.
(macro_definition name: (symbol) @name.definition.function) @definition.function

;; Match function calls
(list (symbol) @name.reference.function) @reference.function
54 changes: 54 additions & 0 deletions cover_agent/lsp_logic/file_map/queries/tree-sitter-elixir-tags.scm
@@ -0,0 +1,54 @@
; Definitions

; * modules and protocols
(call
  target: (identifier) @ignore
  (arguments (alias) @name.definition.module)
  (#match? @ignore "^(defmodule|defprotocol)$")) @definition.module

; * functions/macros
(call
  target: (identifier) @ignore
  (arguments
    [
      ; zero-arity functions with no parentheses
      (identifier) @name.definition.function
      ; regular function clause
      (call target: (identifier) @name.definition.function)
      ; function clause with a guard clause
      (binary_operator
        left: (call target: (identifier) @name.definition.function)
        operator: "when")
    ])
  (#match? @ignore "^(def|defp|defdelegate|defguard|defguardp|defmacro|defmacrop|defn|defnp)$")) @definition.function

; References

; ignore calls to kernel/special-forms keywords
(call
  target: (identifier) @ignore
  (#match? @ignore "^(def|defp|defdelegate|defguard|defguardp|defmacro|defmacrop|defn|defnp|defmodule|defprotocol|defimpl|defstruct|defexception|defoverridable|alias|case|cond|else|for|if|import|quote|raise|receive|require|reraise|super|throw|try|unless|unquote|unquote_splicing|use|with)$"))

; ignore module attributes
(unary_operator
  operator: "@"
  operand: (call
    target: (identifier) @ignore))

; * function call
(call
  target: [
    ; local
    (identifier) @name.reference.call
    ; remote
    (dot
      right: (identifier) @name.reference.call)
  ]) @reference.call

; * pipe into function call
(binary_operator
  operator: "|>"
  right: (identifier) @name.reference.call) @reference.call

; * modules
(alias) @name.reference.module @reference.module
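The `#match?` predicates in the Elixir query restrict a pattern to identifiers whose text matches a regular expression. The sketch below illustrates the effect of the `^...$` anchors using Python's `re` module as a stand-in; that the tree-sitter binding's regex engine behaves the same way for this simple pattern is an assumption, and `is_module_definer` is a hypothetical helper.

```python
import re

# Same pattern as in the "modules and protocols" query above.
DEF_MODULE = re.compile(r"^(defmodule|defprotocol)$")


def is_module_definer(identifier: str) -> bool:
    """True only when the whole identifier is `defmodule` or `defprotocol`."""
    return DEF_MODULE.search(identifier) is not None


for candidate in ("defmodule", "defprotocol", "defmodule_helper", "mydefprotocol"):
    print(candidate, is_module_definer(candidate))
```

Without the anchors, an identifier that merely contains `defmodule` would also satisfy the predicate, so an ordinary function call could be mistaken for a module definition.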
19 changes: 19 additions & 0 deletions cover_agent/lsp_logic/file_map/queries/tree-sitter-elm-tags.scm
@@ -0,0 +1,19 @@
(value_declaration (function_declaration_left (lower_case_identifier) @name.definition.function)) @definition.function

(function_call_expr (value_expr (value_qid) @name.reference.function)) @reference.function
(exposed_value (lower_case_identifier) @name.reference.function) @reference.function
(type_annotation ((lower_case_identifier) @name.reference.function) (colon)) @reference.function

(type_declaration ((upper_case_identifier) @name.definition.type)) @definition.type

(type_ref (upper_case_qid (upper_case_identifier) @name.reference.type)) @reference.type
(exposed_type (upper_case_identifier) @name.reference.type) @reference.type

(type_declaration (union_variant (upper_case_identifier) @name.definition.union)) @definition.union

(value_expr (upper_case_qid (upper_case_identifier) @name.reference.union)) @reference.union


(module_declaration
  (upper_case_qid (upper_case_identifier)) @name.definition.module
) @definition.module
30 changes: 30 additions & 0 deletions cover_agent/lsp_logic/file_map/queries/tree-sitter-go-tags.scm
@@ -0,0 +1,30 @@
(
  (comment)* @doc
  .
  (function_declaration
    name: (identifier) @name.definition.function) @definition.function
  (#strip! @doc "^//\\s*")
  (#set-adjacent! @doc @definition.function)
)

(
  (comment)* @doc
  .
  (method_declaration
    name: (field_identifier) @name.definition.method) @definition.method
  (#strip! @doc "^//\\s*")
  (#set-adjacent! @doc @definition.method)
)

(call_expression
  function: [
    (identifier) @name.reference.call
    (parenthesized_expression (identifier) @name.reference.call)
    (selector_expression field: (field_identifier) @name.reference.call)
    (parenthesized_expression (selector_expression field: (field_identifier) @name.reference.call))
  ]) @reference.call

(type_spec
  name: (type_identifier) @name.definition.type) @definition.type

(type_identifier) @name.reference.type @reference.type
20 changes: 20 additions & 0 deletions cover_agent/lsp_logic/file_map/queries/tree-sitter-java-tags.scm
@@ -0,0 +1,20 @@
(class_declaration
  name: (identifier) @name.definition.class) @definition.class

(method_declaration
  name: (identifier) @name.definition.method) @definition.method

(method_invocation
  name: (identifier) @name.reference.call
  arguments: (argument_list) @reference.call)

(interface_declaration
  name: (identifier) @name.definition.interface) @definition.interface

(type_list
  (type_identifier) @name.reference.implementation) @reference.implementation

(object_creation_expression
  type: (type_identifier) @name.reference.class) @reference.class

(superclass (type_identifier) @name.reference.class) @reference.class