[RFC] FamilySeer: A new search method for Auto-scheduler #57
Conversation
Thanks for the RFC. Will review next week.
Thanks again for the RFC. Please see my comments.
Moreover, it would be good to have a tracking issue with a PR plan, listing the PRs to be filed. In addition to the major change and unit tests, you should also have PRs covering tutorials, documentation, etc.
```python
# start tuning
# tuner.tune(tune_option)  # old call; a new parameter is added to the tune function
tuner.tune(tune_option, search_policy="sketch.xgb.family_op")
```
The policy name could be more informative; `family_op` is not a common term after all.
The `tuner` loads the `tune_option` into the `tune` function. There are several parameters in the `tune` function (refer to class [TaskScheduler](https://tvm.apache.org/docs/reference/api/python/auto_scheduler.html?highlight=taskscheduler#tvm.auto_scheduler.TaskScheduler)). Users can enable our method by changing the `search_policy` parameter to `sketch.xgb.family_<family_algorithm>`. We currently provide two family algorithms as options: `op` classifies subgraphs based on their core operation, and `hash` classifies subgraphs based on their operation sequence. We recommend using `op` to achieve better performance.
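The distinction between the two family algorithms can be sketched as follows. This is a hypothetical, self-contained illustration: the real implementation works on auto-scheduler `SearchTask` objects, whereas here a "task" is just a tuple of operator names, and the choice of the first operator as the anchor is an assumption for illustration only.

```python
import hashlib

# Hypothetical stand-in for subgraphs: each task is an ordered op sequence.
tasks = [
    ("conv2d", "add", "relu"),
    ("conv2d", "add"),
    ("dense", "add", "relu"),
]

def classify_by_op(tasks):
    """'op'-style grouping: subgraphs sharing an anchor op form one family."""
    groups = {}
    for idx, ops in enumerate(tasks):
        groups.setdefault(ops[0], []).append(idx)  # assume ops[0] is the anchor
    return groups

def classify_by_hash(tasks):
    """'hash'-style grouping: only identical op sequences share a family."""
    groups = {}
    for idx, ops in enumerate(tasks):
        key = hashlib.md5("_".join(ops).encode()).hexdigest()
        groups.setdefault(key, []).append(idx)
    return groups

print(classify_by_op(tasks))         # tasks 0 and 1 share the "conv2d" family
print(len(classify_by_hash(tasks)))  # all three sequences differ -> 3 families
```

Under `op`, structurally different subgraphs with the same core operation still share a family (and a cost model); under `hash`, only structurally identical subgraphs do, which yields smaller families.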
By this description, `op` seems not an accurate term. IIUC, both `op` and `hash` target subgraphs; the only difference is the way the similarity is determined, so again, it would be good to have better naming.
In addition, since you recommend using `op` for better performance, could you elaborate on the need for `hash`?
```python
else:
    family_group[task_hash].append(idx)

elif class_type == "ind":
```
Is `ind` the third version? I didn't find it in the previous section.
```python
if class_type == "op":
    for idx, task in enumerate(tasks):
        task_layers = task.desc.split('_')
        if task_layers[1] not in family_group:
```
It seems to me that this is a bit too ad-hoc:
- It relies on `task.desc`, which is supposed to be used only for user reference, and its format isn't guaranteed.
- It identifies the second op (e.g., `task_layers[1]`) to be the anchor op, but this may not 100% hold.
The accuracy of the cost model determines the search quality, but Ansor uses a monolithic cost model to predict the performance of different computation graphs (subgraphs), resulting in an accuracy loss during tuning.
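The idea of replacing the monolithic model with per-family models can be sketched minimally as below. This is a hypothetical illustration only: a "model" here is just a running mean of observed latencies, whereas the real system trains XGBoost cost models on measured program features.

```python
# Hypothetical sketch: one cost model per subgraph family, so measurements
# from one family never dilute another family's model.

class MeanModel:
    """Toy stand-in for a cost model: predicts the mean observed latency."""
    def __init__(self):
        self.records = []
    def update(self, latency):
        self.records.append(latency)
    def predict(self):
        return sum(self.records) / len(self.records)

family_models = {}

def update_family(family_key, latency):
    family_models.setdefault(family_key, MeanModel()).update(latency)

# Measurements from two families with very different latency scales.
for lat in (1.0, 1.2, 0.8):
    update_family("conv2d", lat)
for lat in (10.0, 12.0):
    update_family("dense", lat)

print(family_models["conv2d"].predict())  # ~1.0, unaffected by dense samples
```

A single model trained on all five samples would be pulled toward the `dense` latencies; the per-family split keeps each prediction on its own scale.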
The task scheduler allocates most of the time budget to the subgraphs with the most improvement potential (i.e., those with the highest latency). This approach works well at the beginning of autotuning. However, as a high-potential subgraph gradually reaches its peak performance given an adequate time budget, the other subgraphs are left with little time budget to reach their own peak performance.
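The allocation behavior described above can be illustrated with a toy latency-weighted split. This is not the actual task scheduler (which uses gradient-based strategies); it is a minimal sketch of the failure mode: the highest-latency task keeps absorbing the budget even after it has little room left to improve.

```python
# Hypothetical sketch of latency-weighted budget allocation.

def allocate_budget(latencies, total_trials):
    """Split total_trials across tasks proportionally to their latency."""
    total = sum(latencies)
    return [round(total_trials * lat / total) for lat in latencies]

# Task 0 dominates end-to-end latency, so it receives most of the budget,
# leaving little for the others -- the problem described above once task 0
# has already hit its peak performance.
print(allocate_budget([8.0, 1.0, 1.0], 100))  # -> [80, 10, 10]
```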
IMHO, the solution to this problem is to improve the task scheduler strategy. Since the task scheduler predicts the peak performance of a task, as long as this prediction is accurate enough, it shouldn't spend more time on the task that has achieved the peak performance.
Foresee tuning takes `task_idx_groups` (a list of subgraph families) and `skip_measures_per_round` as inputs and tunes all the subgraphs inside the list.
What is `skip_measures_per_round`?
# Drawbacks
[drawbacks]: #drawbacks
When searching in a larger search space (such as a larger batch size), FamilySeer performs similarly to, or sometimes worse than, Auto-scheduler. This is because a larger search space requires more time before the cost model can provide accurate predictions. Deploying an inaccurate cost model in foresee tuning may result in spending the time budget on non-improving code transformations.
It seems to me that this can be improved somehow. For example, at the warmup stage (e.g., the first N trials), all cost models share the same training data from all tasks so that you have the same behavior as the current auto-scheduler. Afterward, you apply different data to each cost model to benefit from the task groups.
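The warmup idea suggested above can be sketched as follows. The threshold and data layout are assumptions for illustration, not part of the RFC or of auto-scheduler's actual API.

```python
# Hypothetical sketch of a warmup stage: for the first N trials, every
# family's cost model trains on the shared pool of measurements from all
# tasks (matching plain auto-scheduler behavior); afterwards, each family
# model only sees its own family's data.

WARMUP_TRIALS = 4  # assumed threshold, not from the RFC

def training_data_for(family_key, trial, shared_pool, family_pools):
    if trial < WARMUP_TRIALS:
        return shared_pool            # warmup: shared training data
    return family_pools[family_key]   # afterwards: per-family data

shared = ["m0", "m1", "m2", "m3"]
pools = {"conv2d": ["m0", "m1"], "dense": ["m2", "m3"]}

print(training_data_for("conv2d", 2, shared, pools))  # warmup -> shared pool
print(training_data_for("conv2d", 5, shared, pools))  # later -> family pool
```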
# Prior art
[prior-art]: #prior-art
Please refer to [this paper](https://arxiv.org/abs/2201.00194).
This is not a "prior" art lol. You should just put this link in the summary section, and for this section you should cover other related work (e.g., Ansor, FlexTensor, AutoTVM, etc.).
# Unresolved questions
[unresolved-questions]: #unresolved-questions
Our search method is up for [discussion](https://discuss.tvm.apache.org/t/rfc-familyseer-a-new-search-method-for-auto-scheduler/11877).
This is not an unresolved question. Unresolved questions should be the technical difficulty or the drawbacks of the proposed approach.
- Feature Name: FamilySeer: A new search method for Auto-scheduler
- Start Date: 2021-01-07
- RFC PR: [apache/tvm-rfcs#57](https://github.com/apache/tvm-rfcs/pull/57)
- GitHub Issue: [apache/tvm#9875](https://github.com/apache/tvm/pull/9875)
The link you put is the pull request but not the tracking issue. Please open a new issue to track the PR progress and put the link here.
RFC topic in forum: https://discuss.tvm.apache.org/t/rfc-familyseer-a-new-search-method-for-auto-scheduler/11877
@comaniac @junrushao1994