
Analytic function to the rescue! Again.

My current APEX project has a requirement to show a chart on one of the pages. No big deal. Usually. But because the chart should represent some value over time, and that value could be stored in the database as often as every second, it could contain tens of thousands, if not hundreds of thousands, of points.
So generating the XML, transferring it to the browser and having the chart engine interpret it... was slow...
So I had to come up with a solution to reduce the number of points without defeating the purpose of the chart. Oh, and did I mention that the value could be stored every second, but could also be every minute, hour, whatever?
The first thing I came up with was the SAMPLE clause. Never heard of it and never used it before. You can just do a SELECT * FROM EMP SAMPLE(10) and as a result you'll get roughly 10% of the rows of the EMP table. The only drawback was that the result could be different every time, so when refreshing the chart, it could look really different. Another, more minor, hiccup was that the sample percentage has to be hard-coded as a literal and can't be parametrised (another "smaller" subrequirement).
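Just for reference, a minimal sketch of that first attempt (using the standard EMP demo table; the percentage has to be a literal, not a bind variable):

-- returns roughly 10% of the rows of EMP;
-- the set of rows returned can differ on every execution
select *
from   emp sample (10);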
So after some research I stumbled upon an Analytic Function that might do the trick: NTILE( number ). This function "divides an ordered data set into a number of buckets indicated by the number parameter and assigns the appropriate bucket number to each row" (quote from the documentation - couldn't say it better). So using this function, you can equally divide 100,000 records into, say, 25 buckets - ordered by timestamp. And once you've done that, you can easily calculate the average value per bucket. And the average timestamp as well. And just use these values to generate a much smaller XML document... Oh, and another fine thing: you can pass any number parameter to NTILE, so the same query can generate either 10 or 10,000 points...

As a - functionally useless - example, here's a query that generates 15 rows, each with the average object_id of a bucket, labelled with the midpoint of the creation dates in that bucket:
select  to_char( startdate + ( enddate - startdate )/2
               , 'YYMMDD HH24:MI:SS' ) label   -- midpoint of the bucket as the label
,       average
from
(
  select min(created)  startdate               -- first timestamp in the bucket
  ,      max(created)  enddate                 -- last timestamp in the bucket
  ,      avg(value)    average                 -- average value per bucket
  from
  (
    select object_id value
    ,      created
    ,      ntile( 15 ) over ( order by created ) as bucket   -- assign each row to one of 15 buckets
    from   all_objects
  )
  group by bucket
  order by 1
)
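And because NTILE accepts any (positive integer) expression as its parameter, the same query can be driven by a bind variable. A sketch of how that could look in APEX - assuming a page item, here hypothetically called P1_POINTS, holds the desired number of points:

select  to_char( startdate + ( enddate - startdate )/2
               , 'YYMMDD HH24:MI:SS' ) label
,       average
from
(
  select min(created)  startdate
  ,      max(created)  enddate
  ,      avg(value)    average
  from
  (
    select object_id value
    ,      created
           -- the number of buckets (= chart points) now comes from the page item
    ,      ntile( to_number( :P1_POINTS ) ) over ( order by created ) as bucket
    from   all_objects
  )
  group by bucket
  order by 1
)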

Comments

Byte64 said…
Hi Roel,
thanks for the insightful posting.
The real challenge with certain analytic functions, for me, is to understand their real applications. The examples provided in the documentation usually do not shine for clarity, so it becomes hard to imagine when a given function may come in handy in a real situation.

And not to mention the verbose syntax they come with. I don't know if you ever used the clauses concerning how many rows to span - there the real fun begins. I always need to maximize the SQL Developer window to make a column fit on a single line, and keep the syntax diagram at hand to check the keywords I'm always missing!

Cheers
Flavio

That was easy, right? But now we need to process the changes in the Ename column when the form is submitted, but only if the checkbox is checked. All the columns are submitted as separated arrays, named apex_application.g_f0x - where the "x" is the value of the "p_idx" parameter you specified in the apex_item calls. So we have apex_application.g_f01, g_f02 and g_f03. But then you discover APEX has the oddity that the "checkbox" array only contains values for the checked rows. Thus if you just check "Jones", the length of g_f02 is 1 and it contains only the empno of Jones - while the other two arrays will contain all (14) rows. So for processing y…