SQL General Queries

Last Post 06 Sep 2010 09:34 AM by Vijay12345. 2 Replies.
AddThis - Bookmarking and Sharing Button Printer Friendly
  •  
  •  
  •  
  •  
  •  
Sort:
PrevPrev NextNext
You are not authorized to post a reply.
Author Messages Resolved
Vijay12345
New Member
New Member

--
29 Jul 2010 05:36 AM
hi ,

     I am facing a problem while trying to find some unwanted junk charactors from a column in a table from a Sybase IQ database.

There is table called County and it has a column called End Date. This field is loaded by using raw file and there is a chance that, this End Date field is po***ted with some junk charactors(E.g. ., % , ^ etc).  

The date format would be YYYYMM(E.g. 201000). My intention here is, i have to read the all the charactors in the string, if any of the junk charactors or alphabets in this string it should give the count of that. I already tried the operators like 'NOT SIMILLAR' , 'NOT EXISTS', NOT LIKE".None of the perators are not working.

The query i used sofar:

SELECT X, Y FROM hpi3.hpi_county WHERE Y NOT LIKE '(0-9)'

SELECT count(distinct X) FROM hpi3.hpi_county WHERE Y NOT LIKE '(0-9)'

SELECT X, Y FROM hpi3.hpi_county WHERE Y NOT REGEXP '(0-9)'

SELECT X, Y FROM hpi3.hpi_county WHERE Y REGEXP_SUBSTR '(0-9)'


Thanks in advance,
Vijay
russellb
New Member
New Member

--
29 Jul 2010 08:54 AM
select * from hpi3.hpi_county where y like '%[^0-9]%'
Vijay12345
New Member
New Member

--
06 Sep 2010 09:34 AM
hi Russel, thanks for the solution provided. Sorry i am sending reply in delay. This has helped during my validation. Now i have come up with one more problem to trouble you guys .


The problem is this and here you go.

I have created a table called 'REAS_HPI_STATE_Source'. In this table i have few columns as mentioned below.

State_Code, YYYYMM, Tier_Code, Home_Price_Index, Rpt_Sales_Med, Rpt_Sales_Cnt, Start_Date, End_Date, Std_Dev, Volatility_Index, Dis_Nondis_Flag, Last_Update_Date, HPI_1M_Ago, HPI_3M_Ago, HPI_6M_Ago, HPI_12M_Ago, HPI_1M_Change, HPI_3M_Change, HPI_6M_Change, HPI_12M_Change, Period, period_no

If you look at the above column names, there are Ago value column where in i have to po***te those values based on Period No.Currently i am using following programming method where in i am po***ting these ago values.

REAS_Dim_Periods is the table where i have period information which is used for po***ting the Period in target table.

Code snippet: for 1M Ago po***tion. Similar code is used for 3, 6 and 12M Ago po***tion also.

Declarations:

Declare @SQL_State_1m_Ago nvarchar(1000), @SQL_State_3m_Ago nvarchar(1000)
Declare @SQL_State_6m_Ago nvarchar(1000), @SQL_State_12m_Ago nvarchar(1000)
Declare @Period_wise_update_State table ( si_no smallint identity , period_no smallint)
INSERT INTO @Period_wise_update_State
select distinct period_no from REAS_HPI_STATE_Source
declare @period_no_max_State smallint
declare @period_no_min_State smallint
select @period_no_max_State = max(si_no) from @Period_wise_update_State
declare @period_no smallint

Begin
Set Nocount On
begin
Set @SQL_State_1m_Ago=('select a11.tier_code tier_code, a12.period_no period_no, a11.State_code State_code, max(a11.home_price_index) HPI_m_ago into State_Temp1 from REAS_HPI_STATE_Source a11 join REAS_Dim_Periods a12 on (a11.period_no = (a12.period_no - 1)) group by a11.tier_code, a12.period_no, a11.State_code')
Exec (@SQL_State_1m_Ago)
End

--ReSetting the Min value = 1

SET @period_no_min_State = 1
 Begin
while @period_no_min_State <= @period_no_max_State
BEGIN Update A
set HPI_1m_ago = B.hpi_m_ago
From REAS_HPI_STATE_Source A
Join State_Temp1 B on A.period_no=B.period_no and A.tier_code=B.tier_code and A.State_code=B.State_code
and A.period_no = @period_no
select @period_no = period_no from @Period_wise_update_State where si_no = @period_no_min_State
select @period_no_min_State = @period_no_min_State + 1
END

Dropping temp table:

Begin
If Object_Id ('State_Temp1') is NOT NULL
begin
drop table State_Temp1
end

This code is working absolutely fine. But the problem here is, it is taking more than 20 minutes which is not feasible for my validation. I need this to be optimized. SO that the whole execution should take max 5 to 6 minutes. I heard from someone that, indexes can be used for optimizing this. But i am little confused that which type of the index i need to define on this table for which columns. At this juncture i am thinking like to define a Unique nonclustered index because i have three columns data which are unique among all remaining columns. The unique combination is as follows.
State Code, Period and Tier.

Please let me know my approach is right OR wrong. If i am creating index, i have a doubt on this......

I create a index on the table. If i am truncating this table and reloading it, whatever the index i created are gone and i have to define newly OR my old index only works? Because all my table operations are truncate and load.

My whole intention is reduce the execution time to max 5 OR 6 minutes. The total number of data i have on this table is 258960 which is pretty small data set when you consider this as Datawarehouse.

I am anti***ting a quick reply because this is high priority item which is in my plate for long time and my manager is keeps on bugging me on this .
Sorry for the trouble whatever i am giving you guys.

Thanks in advance,
VJ
You are not authorized to post a reply.

Acceptable Use Policy