class Raven::Processor::UTF8Conversion

Constants

REPLACE

Slightly misnamed: this processor actually just removes any bytes with invalid encoding. Previously, our JSON backend required UTF-8. Since we now use the built-in JSON, we can use any encoding, but it must still be valid so that we can do things like call match and slice on strings.
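To illustrate why valid encoding matters, the sketch below uses only core Ruby and does not load Raven. It assumes the replacement value is an empty string, which the description above implies but this page does not show; the sample string and variable names are made up for the example.

```ruby
# A UTF-8 string that contains one invalid byte (\xFF).
broken = "id=42 \xFF".dup.force_encoding(Encoding::UTF_8)

# Regexp matching refuses to operate on a string with broken encoding.
error = begin
  broken.match(/\d+/)
  nil
rescue ArgumentError => e
  e
end

# After the invalid byte is removed, the same match succeeds.
# Assumption: the replacement is "" (bytes are dropped, not substituted).
matched = broken.scrub("").match(/\d+/)[0]
```

Here `error` holds the ArgumentError raised by the first match, and `matched` holds the digits found once the string has been scrubbed.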

Public Instance Methods

process(value)
# File lib/raven/processor/utf8conversion.rb, line 9
def process(value)
  case value
  when Hash
    !value.frozen? ? value.merge!(value) { |_, v| process v } : value.merge(value) { |_, v| process v }
  when Array
    !value.frozen? ? value.map! { |v| process v } : value.map { |v| process v }
  when Exception
    return value if value.message.valid_encoding?

    clean_exc = value.class.new(remove_invalid_bytes(value.message))
    clean_exc.set_backtrace(value.backtrace)
    clean_exc
  when String
    # Encoding::BINARY / Encoding::ASCII_8BIT is a special binary encoding.
    # valid_encoding? will always return true because it contains all codepoints,
    # so instead we check if it only contains actual ASCII codepoints, and if
    # not we assume it's actually just UTF8 and scrub accordingly.
    if value.encoding == Encoding::BINARY && !value.ascii_only?
      value = value.dup
      value.force_encoding(Encoding::UTF_8)
    end
    return value if value.valid_encoding?

    remove_invalid_bytes(value)
  else
    value
  end
end
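The String branch above can be traced by hand with core Ruby alone. This is a rough sketch of the same steps, not a call into Raven itself; the sample bytes are invented, and the empty-string replacement is an assumption about REPLACE.

```ruby
# Bytes for "café " followed by one stray invalid byte, tagged as binary.
raw = "caf\xC3\xA9 \xFF".b

binary      = raw.encoding == Encoding::BINARY  # tagged ASCII-8BIT
looks_valid = raw.valid_encoding?               # binary accepts every byte
ascii_only  = raw.ascii_only?                   # false: high bytes present

# Mirror the branch: reinterpret a copy as UTF-8, then check validity again.
utf8 = raw.dup.force_encoding(Encoding::UTF_8)
still_valid = utf8.valid_encoding?              # the stray \xFF is now visible

# Assumption: REPLACE is "", so invalid bytes are simply dropped.
cleaned = utf8.scrub("")
```

Working on a copy (`dup`) leaves the caller's original binary string untouched, which matches what the method does before calling `force_encoding`.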

Private Instance Methods

remove_invalid_bytes(string)
# File lib/raven/processor/utf8conversion.rb, line 43
def remove_invalid_bytes(string)
  string.scrub(REPLACE)
end
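The difference between removing and replacing invalid bytes comes down to the argument passed to String#scrub. The comparison below is a small sketch in core Ruby; the value of REPLACE is not shown on this page, so the empty string used here is an assumption.

```ruby
broken = "hello\xFFworld".dup.force_encoding(Encoding::UTF_8)

# With no argument, scrub substitutes U+FFFD for each invalid sequence
# in a Unicode encoding.
substituted = broken.scrub

# With an empty replacement (assumed to match REPLACE), the invalid
# bytes are removed outright.
removed = broken.scrub("")

# scrub returns a new string; the receiver is left as it was.
unchanged = !broken.valid_encoding?
```

Passing an empty replacement is what makes the processor a byte remover rather than a converter, in line with the "slightly misnamed" note in the constant's description.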