How to refactor a huge lookup table in Ruby

- Last updated: 2016-02-01
Question:

I have a method:

def assign_value
   ...
   @obj.value = find_value
end

and a huge lookup table:

def find_value
   if @var > 0 && @var <= 30
      0.4
   elsif @var > 30 && @var <= 50
      0.7
   elsif @var > 50 && @var <= 70
      1.1
   elsif @var > 70 && @var <= 100
      1.5
   elsif @var > 100 && @var <= 140
      2.10
   elsif @var > 140 && @var <= 200
      2.95
   elsif @var > 200 && @var <= 300
      4.35
   elsif @var > 300 && @var <= 400
      6.15
   elsif @var > 400 && @var <= 500
      7.85
   elsif @var > 500 && @var <= 600
      9.65
   ...

end

and so on for 1800 lines.

Needless to say, it's for the tax department of an unnamed country. Right now, it's written manually (all 1800 lines of it) to account for the varying lengths of the integer ranges and the decimal return values.

Without rewriting an entire country's tax code, how would I refactor this?

The ranges and values do change on a yearly basis, and we need to maintain backwards compatibility. A database table would simplify matters, but because we're dealing with a number of different countries, each with different requirements and tax codes, creating a database table isn't as straightforward as it sounds.

Comment: Create a table in the DB and look it up using an appropriate where clause.

Comment: I would make a hash first: { 0..30 => 0.4, 31..50 => 0.7 } etc. That way you can remove all conditionals.

Comment: Better to save it to the DB.

Comment: In my view, changing database entries is preferable to changing code and pushing a new release for tax code changes.

Comment: No tax payable if @var = 141?

Answer:

@MilesStanfield's answer has pretty much the same impact as yours. As said in my comment, I would use a hash to get rid of all conditions (or cases):

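COUNTRY_TAX_RATES = {
  0..30 => 0.4,
  31..50 => 0.7,
  51..70 => 1.1
}

# Direct lookup only works when the key is the exact range object.
COUNTRY_TAX_RATES[0..30] # 0.4
# To look up a single value, find the range that covers it.
COUNTRY_TAX_RATES.find { |range, _rate| range === 10 }&.last # 0.4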

If you don't change the codes often, you could use this. If they change frequently, you might consider another approach (for example, storing them in a DB and managing them through a user interface).
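Integer ranges such as 0..30 and 31..50 leave a gap for fractional values like 30.5, which the question's conditions (above the lower bound, up to and including the upper bound) do cover. One way to keep those boundaries is to store only each bracket's upper bound. A minimal sketch; `BRACKETS` and `rate_for` are hypothetical names, and only the first few rows from the question are shown:

```ruby
# Each pair is [inclusive upper bound, rate], sorted by upper bound.
# Hypothetical constant; rows copied from the question's first branches.
BRACKETS = [
  [30, 0.4],
  [50, 0.7],
  [70, 1.1],
  [100, 1.5]
].freeze

# Returns the rate of the first bracket whose upper bound is >= value,
# or nil when value is not positive or lies above the last bound.
def rate_for(value)
  return nil unless value > 0

  pair = BRACKETS.find { |upper, _rate| value <= upper }
  pair && pair.last
end

rate_for(10)   # => 0.4
rate_for(30.5) # => 0.7
rate_for(500)  # => nil
```

Because the list is sorted, the first upper bound that is not below the value identifies the bracket, so no lower bounds need to be stored.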

UPDATE: Now that I'm no longer on my mobile, I want to expand this answer a little further. It seems you are disregarding @Wand Maker's proposal, but I don't think you should.

Imagine you use my hash. It is convenient, free of conditionals and easy to adjust. However, as Wand Maker points out, every time either a range or a decimal changes, you need a developer to update the code. This might be OK, because you only have to do it once a year, but it is not the cleanest approach.

You want accountants to be able to update the tax rates and codes, instead of developers. So you should probably create a table which contains these attributes. I'm not sure what the ranges and decimals stand for in your example, but I hope you get the idea:

A Tax model (ActiveRecord) with the columns range_column (give this an explicit, explanatory name; you could also use separate min and max columns), code, rate and country_id:

class Tax < ActiveRecord::Base
  serialize :range_column

  belongs_to :country
end

Country has_many :taxes

If you want to know the tax rate for Malta (country_id: 30), with tax code 40 (whatever this might mean), you could do something like this:

# You might need a BETWEEN condition instead; I can't check right now whether querying a serialized column works like this.
Tax.select(:rate).joins(:country).find_by(country_id: 30, range_column: 40).rate

Now an accountant can update the ranges, decimals or any other attribute when these change (of course you have to build a CRUD for it first).

P.S. Don't mind the naming; I'm not sure what these numbers represent. :)
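The variant with separate min and max columns can be sketched without a database. Everything below is an assumption for illustration: `TaxBracket`, its fields, the `year` field (added for the backwards compatibility the question asks for) and all row values are made up, and a plain array stands in for the table.

```ruby
# In-memory stand-in for a tax table; every name and row here is hypothetical.
TaxBracket = Struct.new(:country_id, :year, :min_value, :max_value, :rate)

ROWS = [
  TaxBracket.new(30, 2015, 0, 30, 0.4),
  TaxBracket.new(30, 2015, 30, 50, 0.7),
  TaxBracket.new(30, 2016, 0, 30, 0.5),
  TaxBracket.new(30, 2016, 30, 50, 0.8)
].freeze

# Finds the row with min_value < value <= max_value for a country and year.
# In a real table this would become a query filtering on the same conditions.
def lookup_rate(country_id, year, value)
  row = ROWS.find do |r|
    r.country_id == country_id && r.year == year &&
      r.min_value < value && value <= r.max_value
  end
  row && row.rate
end

lookup_rate(30, 2015, 40) # => 0.7
lookup_rate(30, 2016, 40) # => 0.8
lookup_rate(30, 2016, 99) # => nil
```

Keeping one set of rows per year means last year's rates stay available after this year's are entered.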

Comment: I think this method is the way to go. We'd also get the option of storing the values in the database and sending them as JSON if the need arises. Thanks Peter.

Comment: I've read your edit, but I've also just built a linear regression script and found an underlying pattern for a 1600-line block in the values. Now I'm torn between making accountants hand-crank 1800 lines of changes every year (very fun) or writing a method to generate those values (not so fun). I think a combination of both methods may be the order of the day. That, and gently introducing the tax formulation department to the concept of percentages.

Comment: I see, it would be great if it is linear. I currently have no idea what the details of your application are. You can automate it if you know what will happen next year. Or, if possible, you can request Excel or CSV files with all tax codes and rates (provided on a yearly basis) and store the content of the files in your DB. Anyway, I'm shooting in the dark here, because I have no idea what you are working with right now.

Comment: No worries mate. You've given me plenty of good ideas. I'm hardcoding the hash in code for now and generating most of it via an algorithm because we're on a deadline, but eventually we'll move to a DB-stored system as you suggested. The CSV upload idea seems to be the simplest and most future-proof way to solve this without requiring developer input. Many thanks.

Comment: I upvoted your answer for the second part, but FYI your hash solution doesn't work. It returns nil. Hash lookups don't work the same way as case expressions (as in @MilesStanfield's answer). To get 0.4 out of the above hash you'll have to use COUNTRY_TAX_RATES[0..30].
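The CSV idea from the comments could look roughly like this. It is only a sketch: the column names `upper_bound` and `rate`, the inline sample data, and the helpers `load_brackets` and `rate_from` are all assumptions, and a real upload would read a file supplied by the accountants instead of a string.

```ruby
require "csv"

# Hypothetical file contents standing in for an uploaded CSV.
CSV_DATA = <<~DATA
  upper_bound,rate
  30,0.4
  50,0.7
  70,1.1
DATA

# Parses the rows into [upper_bound, rate] pairs sorted by upper bound.
def load_brackets(csv_text)
  pairs = CSV.parse(csv_text, headers: true).map do |row|
    [Float(row["upper_bound"]), Float(row["rate"])]
  end
  pairs.sort_by(&:first)
end

# Returns the rate of the first bracket whose upper bound is >= value.
def rate_from(brackets, value)
  return nil unless value > 0

  pair = brackets.find { |upper, _rate| value <= upper }
  pair && pair.last
end

brackets = load_brackets(CSV_DATA)
rate_from(brackets, 45) # => 0.7
```

`Float(...)` raises on malformed cells, so a bad upload fails loudly instead of silently producing wrong rates.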

Answer:

I would use a case statement with ranges:

case @var
when 0..30 then 0.4
when 31..50 then 0.7
when 51..70 then 1.1
...
end
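
With integer ranges such as 0..30 and 31..50, a fractional value like 30.5 matches no branch. A small sketch, assuming the question's boundaries (above the lower bound, up to and including the upper bound) and showing only the first three rows: because case takes the first matching branch, adjacent ranges can share an endpoint.

```ruby
# Sketch only: the method takes the value as an argument instead of @var.
def find_value(var)
  return nil unless var > 0 # the question's first branch excludes 0

  case var
  when 0..30 then 0.4
  when 30..50 then 0.7 # 30 itself was already matched by the branch above
  when 50..70 then 1.1
  end
end

find_value(30)   # => 0.4
find_value(30.5) # => 0.7
find_value(70.5) # => nil
```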

Answer:

Suppose inc is income and arr is an array of pairs [bp, tax], where

  • arr[0][0] = 0, meaning bp (income "breakpoint") for the first element of arr is zero;
  • arr[i][0] < arr[i+1][0], meaning breakpoints are increasing;
  • tax arr[-1][1] is payable if inc >= arr[-1][0]; and
  • tax arr[i][1] is payable if arr[i][0] <= inc < arr[i+1][0], 0 <= i <= arr.size-2.

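Tax can then be computed with

arr.reverse_each.find { |bp, _| inc >= bp }.last

Since the breakpoints are increasing, scanning from the highest breakpoint down finds the bracket that contains inc. A forward scan with arr.find would stop at arr[0] for any non-negative inc, because arr[0][0] = 0.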
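
A worked example of the breakpoint lookup, with made-up values. `ARR`, `tax_linear` and `tax_bsearch` are hypothetical names; both helpers assume inc >= 0 and pick the entry with the largest breakpoint that does not exceed inc. The binary-search variant is an addition that may be worth it for a table of 1800 entries.

```ruby
# Hypothetical breakpoints: [income breakpoint, tax], sorted by breakpoint.
ARR = [[0, 0.0], [30, 0.4], [50, 0.7], [70, 1.1]].freeze

# Linear scan from the highest breakpoint down; the first breakpoint
# that does not exceed inc marks the bracket containing inc.
def tax_linear(arr, inc)
  arr.reverse_each.find { |bp, _tax| inc >= bp }.last
end

# Binary search: index of the first breakpoint greater than inc,
# then one step back. Falls back to the last entry when none is greater.
def tax_bsearch(arr, inc)
  i = arr.bsearch_index { |bp, _tax| bp > inc } || arr.size
  arr[i - 1].last
end

tax_linear(ARR, 45)    # => 0.4
tax_bsearch(ARR, 45)   # => 0.4
tax_bsearch(ARR, 1000) # => 1.1
```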