Skip to content

Conversation

@stenlarsson
Copy link
Contributor

@stenlarsson stenlarsson commented Dec 14, 2025

Rationale for this change

The MakeStructOptions class is not available in GLib/Ruby, and it is used together with the make_struct compute function.

What changes are included in this PR?

This adds the MakeStructOptions class to GLib.

Are these changes tested?

Yes, with Ruby unit tests.

Are there any user-facing changes?

Yes, a new class.

@stenlarsson stenlarsson requested a review from kou as a code owner December 14, 2025 14:46
@github-actions
Copy link

⚠️ GitHub issue #48490 has been automatically assigned in GitHub to PR creator.

@stenlarsson
Copy link
Contributor Author

The class is missing the field_nullability and field_metadata properties because I ran into limitations of the gobject-introspection gem, e.g.: NotImplementedError: TODO: GIArgument(GList)[ghash] -> Ruby. I couldn't figure out what types the properties should have for it to work.

@kou
Copy link
Member

kou commented Dec 20, 2025

Could you push the code that raises NotImplementedError: TODO: GIArgument(GList)[ghash] -> Ruby? I'll improve gobject-introspection gem for the case.

@stenlarsson stenlarsson force-pushed the glib-make-struct-options branch from 85e630c to 730c87a Compare December 20, 2025 14:01
@stenlarsson
Copy link
Contributor Author

I pushed some code. I couldn't figure out what types the properties should have, so I added getters and setters instead, but it doesn't work because of the limitations in gobject-introspection. I'm sure there are more problems with this code, since I can't test it. Perhaps there is a better way to do it?

@kou
Copy link
Member

kou commented Dec 21, 2025

Thanks. I understand this situation and I also consider API.

How about the following API?

GArrowMakeStructOptions *
garrow_make_struct_options_new(void);
void
garrow_make_struct_options_add_field(GArrowMakeStructOptions *options, const char *name, gboolean nullability, GHashTable *metadata);

// no getter or
const char *
garrow_make_struct_options_get_field_name(GArrowMakeStructOptions *options, gsize i);
gboolean
garrow_make_struct_options_get_field_nullability(GArrowMakeStructOptions *options, gsize i);
GHashTable *
garrow_make_struct_options_get_field_metadta(GArrowMakeStructOptions *options, gsize i);
GArrowMakeStructOptions *options = garrow_make_struct_options_new();
garrow_make_struct_options_add_field(options, name1, nullability1, metadata1);
garrow_make_struct_options_add_field(options, name2, nullability2, metadata2);
...

@stenlarsson
Copy link
Contributor Author

I implemented your suggested API. I have some code in my own project that automatically generates Ruby code from the function doc, and it wouldn't work with this API. Still, I guess it is the best we can do.

Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some code in my own project that automatically generates Ruby code from the function doc

Wow! How do you implement it?

BTW, we can provide convenient converter in Ruby like https://github.com/apache/arrow/blob/main/ruby/red-arrow/lib/arrow/equal-options.rb does.

}

enum {
PROP_MAKE_STRUCT_OPTIONS_FIELD_NAMES = 1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we keep this? I think that we should not provided this because mixing field_names and add_field() isn't expected. For example, add_field(); field_names = [...] removes information provided by the first add_field().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it can be confusing, but I think this is more convenient if you don't care about nullability or metadata.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. How about providing convenient APIs like this in Ruby something like the following?

module Arrow
  class MakeStructOptions
    def field_names=(names)
      raise ArgumentError, "mixing #add_field and #field_names= is prohibited" unless n_fields.zero?
      names.each do |name|
        add_field(name, true, nil)
      end
    end
  end
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if you call #field_names= twice?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Raise an exception?

@github-actions github-actions bot added awaiting review Awaiting review awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Dec 22, 2025
@stenlarsson
Copy link
Contributor Author

Wow! How do you implement it?

The project is not open source, but I extracted the relevant parts and put together an example: https://github.com/stenlarsson/arrow-plan/blob/main/test.rb

It provides syntax sugar to make it easier to create and execute Acero plans. Since we generate code for the compute functions, your IDE can show you which functions exist, and what arguments and options they take. Symbols are used to reference fields, and some common Ruby types are automatically wrapped as well. In my opinion it makes it much easier to read the code.

However, this only works if the options classes uses properties. If they use any other API, it becomes inconvenient.

@stenlarsson stenlarsson force-pushed the glib-make-struct-options branch from d665a82 to 8e58c8b Compare December 22, 2025 12:35
@github-actions github-actions bot added awaiting review Awaiting review and removed awaiting review Awaiting review awaiting changes Awaiting changes labels Dec 22, 2025
Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The project is not open source, but I extracted the relevant parts and put together an example: https://github.com/stenlarsson/arrow-plan/blob/main/test.rb

Thanks. It's very helpful!

It provides syntax sugar to make it easier to create and execute Acero plans. Since we generate code for the compute functions, your IDE can show you which functions exist, and what arguments and options they take. Symbols are used to reference fields, and some common Ruby types are automatically wrapped as well. In my opinion it makes it much easier to read the code.

Good point. I want to add similar features to Red Arrow but I want to use dynamic method definitions instead of code generation because Acero functions can be added dynamically.

Does the dynamic method definition work with IDE integration?

However, this only works if the options classes uses properties. If they use any other API, it becomes inconvenient.

We can add convenient API in Ruby.

For example, Arrow::Function#resolve_options provides a try_convert hook:

def resolve_options(options)
return nil if options.nil?
return options if options.is_a?(FunctionOptions)
arrow_options_class = options_type&.to_class
if arrow_options_class
if arrow_options_class.respond_to?(:try_convert)
arrow_options = arrow_options_class.try_convert(options)
return arrow_options if arrow_options
end
arrow_options = (default_options || arrow_options_class.new)
else
arrow_options = default_options
end
return arrow_options if arrow_options.nil?
options.each do |key, value|
setter = :"#{key}="
next unless arrow_options.respond_to?(setter)
arrow_options.__send__(setter, value)
end
arrow_options
end

Arrow::MakeStructOptions.try_convert can accept {field_names:, field_nullability:, field_metadata:}:

module Arrow
  class MakeStructOptions
    class << self
      def try_convert(value)
        case value
        when Hash
          options = new
          field_names = value[:field_names]
          field_nullability = value[:field_nullability] || []
          field_metadta = value[:field_metadata] || []
          field_names.zip(field_nullability, field_metadata) do |name, nullability, metadata|
            options.add_field(name, nullability, metadata)
          end
          options
        else
          nil
        end
      end
    end
  end
end

[field, ...] can also be acceptable:

module Arrow
  class MakeStructOptions
    class << self
      def try_convert(value)
        case value
        when ::Array
          options = new
          value.each do |name, nullability, metadata|
            options.add_field(name, nullability, metadata)
          end
          options
        when Hash
          options = new
          field_names = value[:field_names]
          field_nullability = value[:field_nullability] || []
          field_metadta = value[:field_metadata] || []
          field_names.zip(field_nullability, field_metadata) do |name, nullability, metadata|
            options.add_field(name, nullability, metadata)
          end
          options
        else
          nil
        end
      end
    end
  end
end

}

enum {
PROP_MAKE_STRUCT_OPTIONS_FIELD_NAMES = 1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. How about providing convenient APIs like this in Ruby something like the following?

module Arrow
  class MakeStructOptions
    def field_names=(names)
      raise ArgumentError, "mixing #add_field and #field_names= is prohibited" unless n_fields.zero?
      names.each do |name|
        add_field(name, true, nil)
      end
    end
  end
end

@github-actions github-actions bot added awaiting review Awaiting review awaiting changes Awaiting changes and removed awaiting review Awaiting review labels Dec 23, 2025
@github-actions github-actions bot added awaiting review Awaiting review awaiting committer review Awaiting committer review and removed awaiting review Awaiting review awaiting changes Awaiting changes labels Dec 29, 2025
@stenlarsson stenlarsson force-pushed the glib-make-struct-options branch from 83e908a to 323eef7 Compare December 29, 2025 11:59
@stenlarsson
Copy link
Contributor Author

Good point. I want to add similar features to Red Arrow but I want to use dynamic method definitions instead of code generation because Acero functions can be added dynamically.

Yes, that makes sense for something included with Arrow. In my case it doesn't matter since I can generate the code afterwards.

Does the dynamic method definition work with IDE integration?

No, unfortunately IDE integration for Ruby is pretty bad.

We can add convenient API in Ruby.

This could work, but I implemented another solution. This adds field-nullability and field-metadata as boxed properties of type. These aren't automatically converted by gobject-introspection, so to work around this, I implemented the getter and setter methods to the Ruby extension.

If support is added to gobject-introspection in the future, we can just remove the code from the extension without changing the API. I'm not sure what the type annotations for such properties are supposed to look like however.

Copy link
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can understand your motivation (providing the same API as doc for easy to code generation) but non GObject Introspection friendly API isn't expected.

In general, using GArray/GPtrArray in API is one of "Things to avoid": https://gi.readthedocs.io/en/latest/writingbindableapis.html#arrays

In a general-purpose library, it’s best not to expose GLib array and hash types such as GArray, GPtrArray, GByteArray, GList, GSList, GQueue, and GHashTable in the public API. They are fine for internal libraries, but difficult in general for consumers of introspected libraries to deal with.

GLib's main target is Ruby but I want to keep GLib API usable for other languages such as Vala and Lua.

Can we avoid using this approach (using GArray and GPtrArray properties)?

My suggestion:

  • Provide garrow_make_struct_options_add_field() in GLib
  • Provide Arrow::FunctionOptions#keywords or something for code generation in Ruby
  • Provide Arrow::MakeStructOptions#fields= in Ruby
  • Provide Arrow::MakeStructOptions#keywords (or something) that includes fields in Ruby

}

enum {
PROP_MAKE_STRUCT_OPTIONS_FIELD_NAMES = 1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Raise an exception?

@stenlarsson
Copy link
Contributor Author

I can understand your motivation (providing the same API as doc for easy to code generation) but non GObject Introspection friendly API isn't expected.

I was hoping that GObject Introspection could support this in the future, but I can also see that it will be very difficult to express the types of the key and values inside a hash table inside an array.

In general, using GArray/GPtrArray in API is one of "Things to avoid": https://gi.readthedocs.io/en/latest/writingbindableapis.html#arrays

In a general-purpose library, it’s best not to expose GLib array and hash types such as GArray, GPtrArray, GByteArray, GList, GSList, GQueue, and GHashTable in the public API. They are fine for internal libraries, but difficult in general for consumers of introspected libraries to deal with.

That makes sense, but we are already using GHashTable for the field metadata, so I thought that was fine:

GHashTable *
garrow_field_get_metadata(GArrowField *field)

GLib's main target is Ruby but I want to keep GLib API usable for other languages such as Vala and Lua.

I agree, and again I was hoping this API could be supported by GObject Introspection in the future, but I see now that it won't.

Can we avoid using this approach (using GArray and GPtrArray properties)?

My suggestion:

  • Provide garrow_make_struct_options_add_field() in GLib
  • Provide Arrow::FunctionOptions#keywords or something for code generation in Ruby
  • Provide Arrow::MakeStructOptions#fields= in Ruby
  • Provide Arrow::MakeStructOptions#keywords (or something) that includes fields in Ruby

I feels like the keywords will be very specific to my use case, and not very useful in general? If you were to dynamically generate code you wouldn't need to know exactly what the options are. You could just accept **options and pass that Arrow::Function#resolve_options.

I think I need to give up on the idea that I can generate code for all functions. How about this:

  • Provide garrow_make_struct_options_add_field() in GLib
  • Remove the field-names property to avoid inconsistency
  • Provide Arrow::MakeStructOptions#try_convert as you suggested earlier
  • In my code: Override the make_struct function with a specific implementation

@kou
Copy link
Member

kou commented Jan 1, 2026

That makes sense, but we are already using GHashTable for the field metadata, so I thought that was fine:

GHashTable *
garrow_field_get_metadata(GArrowField *field)

Ah, I understand. GHashTable is acceptable because there is the (element-type KTYPE VTYPE) annotation:

https://gi.readthedocs.io/en/latest/annotations/giannotations.html#type-signature

But GPtrArray<GHashTable> should be avoided.

I think I need to give up on the idea that I can generate code for all functions. How about this:

* Provide garrow_make_struct_options_add_field() in GLib

* Remove the `field-names` property to avoid inconsistency

* Provide `Arrow::MakeStructOptions#try_convert` as you suggested earlier

* In my code: Override the `make_struct` function with a specific implementation

Thanks for the suggestion. Let's use the approach!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants