diff --git a/CHANGELOG.md b/CHANGELOG.md index 34c40e40..3548fa6e 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,6 +6,14 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/) a ## [Unreleased] +### Added + +- The class declarations returned as the result of the indexing operation now have their superclass as a field. It is returned as an array of constants. If the superclass is anything other than a constant lookup, then it raises an error. + +### Changed + +- The `nesting` field on the results of the indexing operation is no longer a single flat array. Instead it is an array of arrays, where each array is a single nesting level. This more accurately reflects the nesting of the nodes in the tree. For example, `class Foo::Bar::Baz; end` would result in `[Foo, Bar, Baz]`, but that incorrectly implies that you can see constants at each of those levels. Now this would result in `[[Foo, Bar, Baz]]` to indicate that it can see either the top level or constants within the scope of `Foo::Bar::Baz` only. + ## [6.0.0] - 2023-02-10 ### Added diff --git a/README.md b/README.md index 500d5fad..03942d46 100644 --- a/README.md +++ b/README.md @@ -29,6 +29,7 @@ It is built with only standard library dependencies. It additionally ships with - [SyntaxTree.format(source)](#syntaxtreeformatsource) - [SyntaxTree.mutation(&block)](#syntaxtreemutationblock) - [SyntaxTree.search(source, query, &block)](#syntaxtreesearchsource-query-block) + - [SyntaxTree.index(source)](#syntaxtreeindexsource) - [Nodes](#nodes) - [child_nodes](#child_nodes) - [copy(**attrs)](#copyattrs) @@ -347,6 +348,10 @@ This function yields a new mutation visitor to the block, and then returns the i This function takes an input string containing Ruby code, an input string containing a valid Ruby `in` clause expression that can be used to match against nodes in the tree (can be generated using `stree expr`, `stree match`, or `Node#construct_keys`), and a block. Each node that matches the given query will be yielded to the block. The block will receive the node as its only argument. +### SyntaxTree.index(source) + +This function takes an input string containing Ruby code and returns a list of all of the class declarations, module declarations, and method definitions within a file. Each of the entries also has access to its associated comments. This is useful for generating documentation or index information for a file to support something like go-to-definition. + ## Nodes There are many different node types in the syntax tree. They are meant to be treated as immutable structs containing links to child nodes with minimal logic contained within their implementation. However, for the most part they all respond to a certain set of APIs, listed below. diff --git a/lib/syntax_tree/index.rb b/lib/syntax_tree/index.rb index ab2460dd..c6973847 100644 --- a/lib/syntax_tree/index.rb +++ b/lib/syntax_tree/index.rb @@ -20,11 +20,12 @@ def initialize(line, column) # This entry represents a class definition using the class keyword. class ClassDefinition - attr_reader :nesting, :name, :location, :comments + attr_reader :nesting, :name, :superclass, :location, :comments - def initialize(nesting, name, location, comments) + def initialize(nesting, name, superclass, location, comments) @nesting = nesting @name = name + @superclass = superclass @location = location @comments = comments end @@ -176,30 +177,101 @@ def location_for(iseq) Location.new(code_location[0], code_location[1]) end + def find_constant_path(insns, index) + index -= 1 while insns[index].is_a?(Integer) + insn = insns[index] + + if insn.is_a?(Array) && insn[0] == :opt_getconstant_path + # In this case we're on Ruby 3.2+ and we have an opt_getconstant_path + # instruction, so we already know all of the symbols in the nesting. + [index - 1, insn[1]] + elsif insn.is_a?(Symbol) && insn.match?(/\Alabel_\d+/) + # Otherwise, if we have a label then this is very likely the + # destination of an opt_getinlinecache instruction, in which case + # we'll walk backwards to grab up all of the constants. + names = [] + + index -= 1 + until insns[index].is_a?(Array) && + insns[index][0] == :opt_getinlinecache + if insns[index].is_a?(Array) && insns[index][0] == :getconstant + names.unshift(insns[index][1]) + end + + index -= 1 + end + + [index - 1, names] + else + [index, []] + end + end + def index_iseq(iseq, file_comments) results = [] queue = [[iseq, []]] while (current_iseq, current_nesting = queue.shift) - current_iseq[13].each_with_index do |insn, index| - next unless insn.is_a?(Array) + line = current_iseq[8] + insns = current_iseq[13] + + insns.each_with_index do |insn, index| + case insn + when Integer + line = insn + next + when Array + # continue on + else + # skip everything else + next + end case insn[0] when :defineclass _, name, class_iseq, flags = insn + next_nesting = current_nesting.dup + + # This is the index we're going to search for the nested constant + # path within the declaration name. + constant_index = index - 2 + + # This is the superclass of the class being defined. + superclass = [] + + # If there is a superclass, then we're going to find it here and + # then update the constant_index as necessary. + if flags & VM_DEFINECLASS_FLAG_HAS_SUPERCLASS > 0 + constant_index, superclass = + find_constant_path(insns, index - 1) + + if superclass.empty? + raise NotImplementedError, + "superclass with non constant path on line #{line}" + end + end + + if (_, nesting = find_constant_path(insns, constant_index)) + # If there is a constant path in the class name, then we need to + # handle that by updating the nesting. + next_nesting << (nesting << name) + else + # Otherwise we'll add the class name to the nesting. + next_nesting << [name] + end if flags == VM_DEFINECLASS_TYPE_SINGLETON_CLASS # At the moment, we don't support singletons that aren't # defined on self. We could, but it would require more # emulation. - if current_iseq[13][index - 2] != [:putself] + if insns[index - 2] != [:putself] raise NotImplementedError, "singleton class with non-self receiver" end elsif flags & VM_DEFINECLASS_TYPE_MODULE > 0 location = location_for(class_iseq) results << ModuleDefinition.new( - current_nesting, + next_nesting, name, location, EntryComments.new(file_comments, location) @@ -207,14 +279,15 @@ def index_iseq(iseq, file_comments) else location = location_for(class_iseq) results << ClassDefinition.new( - current_nesting, + next_nesting, name, + superclass, location, EntryComments.new(file_comments, location) ) end - queue << [class_iseq, current_nesting + [name]] + queue << [class_iseq, next_nesting] when :definemethod location = location_for(insn[2]) results << MethodDefinition.new( @@ -259,24 +332,43 @@ def initialize visit_methods do def visit_class(node) - name = visit(node.constant).to_sym + names = visit(node.constant) + nesting << names + location = Location.new(node.location.start_line, node.location.start_column) + superclass = + if node.superclass + visited = visit(node.superclass) + + if visited == [[]] + raise NotImplementedError, "superclass with non constant path" + end + + visited + else + [] + end + results << ClassDefinition.new( nesting.dup, - name, + names.last, + superclass, location, comments_for(node) ) - nesting << name super nesting.pop end def visit_const_ref(node) - node.constant.value + [node.constant.value.to_sym] + end + + def visit_const_path_ref(node) + visit(node.parent) << node.constant.value.to_sym end def visit_def(node) @@ -302,18 +394,19 @@ def visit_def(node) end def visit_module(node) - name = visit(node.constant).to_sym + names = visit(node.constant) + nesting << names + location = Location.new(node.location.start_line, node.location.start_column) results << ModuleDefinition.new( nesting.dup, - name, + names.last, location, comments_for(node) ) - nesting << name super nesting.pop end @@ -327,6 +420,10 @@ def visit_statements(node) @statements = node super end + + def visit_var_ref(node) + [node.value.value.to_sym] + end end private diff --git a/test/index_test.rb b/test/index_test.rb index 6bb83881..60c51d9d 100644 --- a/test/index_test.rb +++ b/test/index_test.rb @@ -7,14 +7,14 @@ class IndexTest < Minitest::Test def test_module index_each("module Foo; end") do |entry| assert_equal :Foo, entry.name - assert_empty entry.nesting + assert_equal [[:Foo]], entry.nesting end end def test_module_nested index_each("module Foo; module Bar; end; end") do |entry| assert_equal :Bar, entry.name - assert_equal [:Foo], entry.nesting + assert_equal [[:Foo], [:Bar]], entry.nesting end end @@ -28,14 +28,65 @@ def test_module_comments def test_class index_each("class Foo; end") do |entry| assert_equal :Foo, entry.name - assert_empty entry.nesting + assert_equal [[:Foo]], entry.nesting + end + end + + def test_class_paths_2 + index_each("class Foo::Bar; end") do |entry| + assert_equal :Bar, entry.name + assert_equal [%i[Foo Bar]], entry.nesting + end + end + + def test_class_paths_3 + index_each("class Foo::Bar::Baz; end") do |entry| + assert_equal :Baz, entry.name + assert_equal [%i[Foo Bar Baz]], entry.nesting end end def test_class_nested index_each("class Foo; class Bar; end; end") do |entry| assert_equal :Bar, entry.name - assert_equal [:Foo], entry.nesting + assert_equal [[:Foo], [:Bar]], entry.nesting + end + end + + def test_class_paths_nested + index_each("class Foo; class Bar::Baz::Qux; end; end") do |entry| + assert_equal :Qux, entry.name + assert_equal [[:Foo], %i[Bar Baz Qux]], entry.nesting + end + end + + def test_class_superclass + index_each("class Foo < Bar; end") do |entry| + assert_equal :Foo, entry.name + assert_equal [[:Foo]], entry.nesting + assert_equal [:Bar], entry.superclass + end + end + + def test_class_path_superclass + index_each("class Foo::Bar < Baz::Qux; end") do |entry| + assert_equal :Bar, entry.name + assert_equal [%i[Foo Bar]], entry.nesting + assert_equal %i[Baz Qux], entry.superclass + end + end + + def test_class_path_superclass_unknown + source = "class Foo < bar; end" + + assert_raises NotImplementedError do + Index.index(source, backend: Index::ParserBackend.new) + end + + if defined?(RubyVM::InstructionSequence) + assert_raises NotImplementedError do + Index.index(source, backend: Index::ISeqBackend.new) + end end end @@ -56,7 +107,7 @@ def test_method def test_method_nested index_each("class Foo; def foo; end; end") do |entry| assert_equal :foo, entry.name - assert_equal [:Foo], entry.nesting + assert_equal [[:Foo]], entry.nesting end end @@ -77,7 +128,7 @@ def test_singleton_method def test_singleton_method_nested index_each("class Foo; def self.foo; end; end") do |entry| assert_equal :foo, entry.name - assert_equal [:Foo], entry.nesting + assert_equal [[:Foo]], entry.nesting end end